
CONTEMPORARY ERGONOMICS 1998


Proceedings of the Annual Conference
of the Ergonomics Society
Royal Agricultural College
Cirencester
1–3 April 1998

Edited by

M.A.HANSON
Institute of Occupational Medicine
Edinburgh

The Ergonomics Society
UK Taylor & Francis Ltd, 1 Gunpowder Square, London EC4A 3DE

USA Taylor & Francis Inc., 1900 Frost Road, Suite 101, Bristol, PA
19007–1598

This edition published in the Taylor & Francis e-Library, 2004.

Copyright © Taylor & Francis Ltd 1998
except papers by P.J.Goillau et al., R.S.Harvey, and two papers
by L.M.Bouskill et al. © British Crown Copyright 1998/DERA
published with the permission of the Controller of Her Britannic
Majesty’s Stationery Office. And except the paper by L.A.Morris
© British Crown Copyright 1998 reproduced with the permission
of the Controller of Her Britannic Majesty’s Stationery Office. The
views expressed are those of the author and do not necessarily
reflect the views or policy of the Health & Safety Executive or any
other government department.

All rights reserved. No part of this publication may
be reproduced, stored in a retrieval system, or
transmitted, in any form or by any means, electronic,
mechanical, photocopying, recording or otherwise,
without the prior permission of the publisher.

A catalogue record for this book is available from the
British Library.

ISBN 0-203-21201-0 Master e-book ISBN

ISBN 0-203-26949-7 (Adobe eReader Format)


ISBN 0-7484-0811-8 (Print Edition)

Cover design by Hybert Design


Preface

Contemporary Ergonomics 1998 presents the proceedings of the Annual Conference of the
Ergonomics Society, held in April 1998 at the Royal Agricultural College,
Cirencester. The conference is a major international event for Ergonomists
and Human Factors Specialists, and attracts contributions from around the
world.

Papers are chosen by a selection panel from abstracts submitted in the autumn
of the previous year and the selected papers are published in Contemporary
Ergonomics. Papers are submitted as camera ready copy prior to the
conference. Each author is responsible for the presentation of their paper.
Details of the submission procedure may be obtained from the Ergonomics
Society.

The Ergonomics Society is the professional body for Ergonomists and Human
Factors Specialists based in the United Kingdom. It also attracts members
throughout the world and is affiliated to the International Ergonomics
Association. It provides recognition of competence of its members through the
Professional Register. For further details contact:

The Ergonomics Society


Devonshire House
Devonshire Square
Loughborough
Leicestershire
LE11 3DW
UK

Tel/Fax (+44) 1509 234 904


Contents

HUGH STOCKBRIDGE MEMORIAL SESSION


Ergonomics and Standards

Introduction
DM Anderson 2
Ergonomics standards—the good, the bad and the ugly
T Stewart 3
International standardisation of graphical symbols for consumer products
FR Brigham 8
The UK human factors defence standard: past, present and future
RS Harvey 13

STEPHEN PHEASANT MEMORIAL SESSION


The Contribution of Ergonomics to the Understanding
and Prevention of Musculoskeletal Disorders

Introduction
S Lee and D Stubbs 20
The role of physical aspects
PW Buckle 21
The combined effects of physical and psychosocial work factors
J Devereux 25
The role of psychosocial factors
AK Burton 30

MUSCULOSKELETAL DISORDERS

Interpreting the extent of musculoskeletal complaints


C Dickinson 36
People in pain
DM Anderson 41
Prevention of musculoskeletal disorders in the workplace—a strategy for UK
research
LA Morris, R McCaig, M Gray, C Mackay, C Dickinson, T Shaw and N Watson 46
A musculoskeletal risk screening tool for automotive line managers
A Wilkinson, RJ Graves, S Chambers and R Leaver 51
Risk assessment design for musculoskeletal disorders in healthcare
professionals
C Beynon, D Leighton, A Nevill and T Reilly 56
Ergonomic microscopes—solutions for the cyto-screener?
JL May and AG Gale 61
Musculoskeletal discomfort from dancing in nightclubs
SL Durham and RA Haslam 66
MANUAL HANDLING

Is the ergonomic approach advocated in the Manual Handling Regulations being adopted?
KM Tesh 72
Control of manual handling risks within a soft drinks distribution centre
EJ Wright and RA Haslam 77
Training and patient-handling: an investigation of transfer
JA Nicholls and MA Life 82
Risk management in manual handling for community nurses
P Alexander 87
Children’s natural lifting patterns: an observational study
F Cowieson 92
Manual handling and lifting during the later stages of pregnancy
T Reilly and SA Cartwright 96
Posture analysis and manual handling in nursery professionals
JO Crawford and RM Lane 101

POSTURE

Can orthotics play a beneficial role during loaded and unloaded walking?
DC Tilbury-Davis, RH Hooper and MGA Llewellyn 108
Investigation of spinal curvature while changing one’s posture during sitting
FS Faiks and SM Reinecke 113
The effect of load size and form on trunk asymmetry while lifting
G Thornton and J Jackson 118
The effect of vertical visual target location on head and neck posture
R Burgess-Limerick, A Plooy and M Mon-Williams 123

OFFICE ERGONOMICS

Is a prescription of physical changes sufficient to eliminate health and safety problems in
computerised offices?
RM Sharma 130
An evaluation of a trackball as an ergonomic intervention
B Haward 135
Old methods, new chairs. Evaluating six of the latest ergonomic chairs for
the modern office
A Esnouf and JM Porter 140

NEW TECHNOLOGY

Development of a questionnaire to measure attitudes towards virtual reality


S Nichols 146
Orientation of blind users on the World Wide Web
MP Zajicek, C Powell and C Reeves 151
“Flash, splash and crash”: Human factors and the implementation of
innovative Web technologies
A Pallant and G Rainbird 156

WORK STRESS

Determining ergonomic factors in stress from work demands of nurses


DW Jamieson and RJ Graves 162
A risk assessment and control cycle approach to managing workplace stress
RJ Lancaster 167

TELEWORKING

Teleworking: Assessing the risks


M Kerrin, K Hone and T Cox 174
Evaluating teleworking—case study
S Campion and A Clarke 179

TEAM WORKING

Team organisational mental models: an integrative framework for research


J Langan-Fox, S Code and G Edlund 186
The impact of IT&T on virtual team working in the European automotive
industry
C Carter and A May 191

WORK DESIGN

The effect of communication processes upon workers and job efficiency


A Dickens and C Baber 198
A case study of job design in a steel plant
HT Neary and MA Sinclair 203
The effects of age and habitual physical activity on the adjustment to
nocturnal shiftwork
T Reilly, A Coldwells, G Atkinson and J Waterhouse 208
Job design for university technicians: work activity and allocation of function
RF Harrison, A Dickens and C Baber 213

SYSTEM DESIGN AND ANALYSIS

Allocation of functions and manufacturing job design based on knowledge requirements
CE Siemieniuch, MA Sinclair and GMC Vaughan 220
The need to specify cognition within system requirements
IS MacLeod 225
Analysis of complex communication tasks
J Wikman 230

INFORMATION SYSTEMS

Health and safety as the basis for specifying information systems design
requirements
TG Gough 236
Cognitive algorithms
R Huston, R Shell and AM Genaidy 241

DESIGN METHODS

Rapid prototyping in foam of 3D anthropometric computer models in functional postures
S Peijs, JJ Broek and PN Hoekstra 248
The use of high and low level prototyping methods for product user
interfaces
JVH Bonner and P Van Schaik 253
Creative collaboration in engineering design teams
F Reid, S Reed and J Edworthy 258

DESIGN AND USABILITY

Pleasure and product semantics


PW Jordan and AS Macdonald 264
A survey of usability practice and needs in Europe
MC Maguire and R Graham 269
Cultural influence in usability assessment
A Yeo, R Barbour and M Apperley 274

INTERFACE DESIGN

Interface display designs based on operator knowledge requirements


F Sturrock and B Kirwan 280
Understanding what makes icons effective: how subjective ratings can
inform design
SJP McDougall, MB Curry and O de Bruijn 285
Representing uncertainty in decision support systems: the state of the art
C Parker 290
Representing reliability of at-risk information in tactical displays for fighter
pilots
M Piras, S Selcon, J Crick and IRL Davies 295
Semantic content analysis of task conformance
A Totter and C Stary 300

WARNINGS

Warnings: a task-oriented design approach


JM Noyes and AF Starr 306
Effects of auditorily-presented warning signal words on intended carefulness
RS Barzegar and MS Wogalter 311
Listeners’ understanding of warning signal words
J Edworthy, W Clift-Matthews and M Crowther 316
Perceived hazard and understandability of signal words and warning
pictorials by Chinese community in Britain
AKP Leung and E Hellier 321

VERBAL PROTOCOL ANALYSIS

Thinking about thinking aloud


MJ Rooden 328
Adjusting the cognitive walkthrough using the think-aloud method
M Verbeek and H van Oostendorp 333
Verbal protocol data for heart and lung bypass scenario simulation “scripts”
J Lindsay and C Baber 338
Use of verbal protocol analysis in the investigation of an order picking task
B Ryan and CM Haslegrave 343
PARTICIPATORY ERGONOMICS

Selecting areas for intervention


BL Somberg 350
Participatory ergonomics in the construction industry
AM de Jong, P Vink and WF Schaefer 355
User trial of a manual handling problem and its “solution”
D Klein, WS Green and H Kanis 360

INDUSTRIAL APPLICATIONS

Case study: a human factors safety assessment of a heavy lift operation


WI Hamilton and P Charles 366
The application of ergonomics to volume high quality sheet printing and
finishing
ML Porter 371
The application of human factors tools and techniques to the specification of
an oil refinery process controller role
J Edmonds and C Duggan 376
Feasibility study of containerisation for Travelling Post Office operations
G Rainbird and J Langford 381

MILITARY APPLICATIONS

The complexities of stress in the operational military environment


MI Finch and AW Stedmon 388
The development of physical selection procedures. Phase 1: job analysis
MP Rayson 393
The human factor in applied warfare
AE Birkbeck 398

AIR TRAFFIC MANAGEMENT

Getting the picture—Investigating the mental picture of the air traffic controller
B Kirwan, L Donohoe, T Atkinson, H MacKendrick, T Lamoureux and
A Phillips 404
Developing a predictive model of controller workload in air traffic
management
AR Kilner, M Hook, P Fearnside and P Nicholson 409
Assessing the capacity of Europe’s airspace: The issues, experience and a
method using a controller workload model
A Majumdar 414
Evaluation of virtual prototypes for air traffic control—the MACAW technique
PJ Goillau, VG Woodward, CJ Kelly and GM Banks 419
Development of an integrated decision making model for avionics application
D Donnelly, JM Noyes and DM Johnson 424
Psychophysiological measures of fatigue and somnolence in simulated air
traffic control
H David, P Cabon, S Bourgeois-Bougrine and R Mollard 429
DRIVERS AND DRIVING

What’s skill got to do with it? Vehicle automation and driver mental workload
M Young and N Stanton 436
The use of automatic speech recognition in cars: a human factors review
R Graham 441
Integration of the HMI for driver systems: classifying functionality and
dialogue
T Ross 446
Subjective symptoms of fatigue among commercial drivers
PA Desmond 451
How did I get here? Driving without attention mode
JL May and AG Gale 456
Seniors’ driving style and overtaking: is there a “comfortable traffic hole”?
T Wilson 461
Speed limitation and driver behaviour
D Haigney and RG Taylor 466
The ergonomics implications of conventional saloon car cabins on police
drivers
SM Lomas and CM Haslegrave 471
The design of seat belts for tractors
DH O’Neill and BJ Robinson 476

NOISE AND VIBRATION

Auditory distraction in the workplace: a review of the implications from laboratory studies
S Banbury and D Jones 482
Transmission of shear vibration through gloves
GS Paddan and MJ Griffin 487
The effect of wrist posture on attenuation of vibration in the hand-arm system
TK Fredericks and JE Fernandez 492

HAND TOOLS

Criteria for selection of hand tools in the aircraft manufacturing industry: a review
BP Kattel and JE Fernandez 498
Exposure assessment of ice cream scooping tasks
PG Dempsey, R McGorry, J Cotnam and I Bezverkhny 503

THERMAL ENVIRONMENTS

The effect of clothing fit on the clothing Ventilation Index


LM Bouskill, N Sheldon, KC Parsons and WR Withey 510
A thermoregulatory model for predicting transient thermal sensation
F Zhu and N Baker 515
The user-oriented design, development and evaluation of the clothing
envelope of thermal performance
D Bethea and KC Parsons 520
A comparison of the thermal comfort of different wheelchair seating
materials and an office chair
N Humphreys, LH Webb and KC Parsons 525
The effect of repeated exposure to extreme heat by fire training officers
JO Crawford and TJ Milne 530
The effects of self-contained breathing apparatus on gas exchange and heart
rate during fire-fighter simulations
KJ Donovan and AK McConnell 535
The effect of external air speed on the clothing ventilation index
LM Bouskill, R Livingston, KC Parsons and WR Withey 540

COMMUNICATING ERGONOMICS

Commercial planning and ergonomics


J Dillon 546
Human factors and design: Bridging the communication gap
AS Macdonald and PW Jordan 551
Guidelines for addressing ergonomics in development aid
T Jafry and DH O’Neill 556
Determining and evaluating ergonomic training needs for design engineers
J Ponsonby and RJ Graves 560
Ergonomic ideals vs genuine constraints
D Robertson, S Layton and J Elder 565

GENERAL ERGONOMICS

Another look at Hick-Hyman’s reaction time law


TO Kvålseth 572
Design relevance of usage centred studies at odds with their scientific
status?
H Kanis 577
The integration of human factors considerations into safety and risk
assessment systems
JL Williamson-Taylor 582
The use of defibrillator devices by the lay public
T Gorbell and R Benedyk 587
Occupational disorders in Ghanaian subsistence farmers
M McNeill and DH O’Neill 592

AUTHOR INDEX 599

SUBJECT INDEX 603


HUGH STOCKBRIDGE
MEMORIAL SESSION:

ERGONOMICS AND
STANDARDS
HUGH STOCKBRIDGE MEMORIAL LECTURES

Ergonomics and Standards

Introduction
Towards the end of his career, Hugh Stockbridge became known for, amongst other things,
his keen interest in ergonomics standards, but he was not originally a ‘standards man’. In fact
his individualistic style was very far from standard. In the days before the publication of
Murrel’s book Ergonomics (Chapman and Hall, 1965), we had only one ‘cookbook’
published in the UK that could be seen as a standard. This was the 1960 publication from the
MRC/RNPRC: Human Factors in Design and Use of Naval Equipment, intended for use by
Royal Navy designers. Hugh, however, was ever the innovative experimenter in the grand
tradition of Cambridge psychologists, and where ergonomics data was lacking, produced
some of his own. An example was the designs for micro shape coded knobs, intended to be
used on the old ‘Post office keys’, published internally at Farnborough in 1957. His
continuing interest in factors affecting the design of indicators and such controls was
later reflected in a paper with Bernard Chambers in the Journal Ergonomics (Taylor and
Francis, 1970).

By the end of the ‘60s, Hugh was already involved in his great project, as a member of a working
party of the RNPRC to carry out a revision of the earlier handbook. The working party carried
out a thorough review of the proliferating human engineering handbooks and surveys, and by
consideration of best ergonomics principles, conceived a new publication of 11 Chapters, to be
produced in a ring binding to facilitate updating and additions. But that is the subject of Roger
Harvey’s paper in this memorial session. Hugh also made a particularly interesting, and still
valid, critique of ergonomics handbooks and journals to Human Engineering, by Kraiss and
Moraal (Verlag TüV Rheinland Gmbh, 1975). Much later, Hugh became secretary of the Study
Group concerned with the creation of DBF STAN 00–25, and to quote from a contemporary
colleague, ‘…his cryptic minutes were a work of art—inimitable, yet very informative once
decoded and embellished with the knowledge of those who had the need to know.’

The other two papers are complementary in many ways. Tom Stewart covers the considerable
time he has spent developing and using more general International Standards in ergonomics.
It will be especially interesting to hear his comments on the application of ergonomics
principles to creating standards—and the standards he considers have made a negative
contribution to ergonomics! The final paper, contributed by Fred Brigham, illustrates
amongst other things the particular problems encountered in the development of quite
specific standards for graphical symbols for use on consumer products intended for an
international market. The promotion of appropriate testing procedures to ensure usability of
such symbols must be a very important and interesting part of the standard making process.
Hugh would have been proud of this session.
ERGONOMICS STANDARDS—
THE GOOD, THE BAD AND THE UGLY

Tom Stewart

Managing Director
System Concepts Limited
2 Savoy Court, Strand
London WC2R 0EZ
www.system-concepts.com

Hugh Stockbridge was an active supporter of ergonomics standardisation. In
this presentation, I will draw some conclusions about the process of
developing standards (particularly International Standards) and about the
usefulness and usability of the resulting standards themselves. As the title
suggests, there can be major problems with both the process and the
standards (the Ugly) or there can be minor problems with both (the Bad).
However, most of the time, the process works well and the resulting
standards have improved the ergonomics quality of products and systems
and are well received by industry and users (the Good).

Introduction
One of Hugh Stockbridge’s most endearing qualities (apart from his sense of humour) was
his enthusiasm for irreverence. I believe he would have approved of my choice of title and
been pleased that his memory was being cherished in this session although he was also at
home on the stage. This session is like a posthumous award for Hugh and it is clear that there
are some similarities between standards and show business.
Just as awards ceremonies often seem rather incestuous, esoteric and irrelevant to the real
world, many standards seem to be aimed more at ergonomists and their concerns (in their
terminology, structure and emphasis) than at standards users in the real world. Taking the
analogy of awards a step further, I will therefore review ‘The Good, The Bad and The Ugly’
aspects of ergonomics standardisation in reverse order. These observations are based on my
own experience as Chairman of ISO/TC 159/SC 4 Ergonomics of Human System Interaction
and as an active developer of ergonomics standards for more than 15 years. But before
exposing the sordid side of standards making, I would like to explain what the process of
International Standardisation should involve.

The process of International Standardisation


International standards are developed over a period of several years and in the early stages,
the published documents may change dramatically from version to version until consensus is
reached (usually within a Working Group of experts). As the standard becomes more mature
(from the Committee Draft Stage onwards), formal voting takes place (usually within the
parent sub-committee) and the draft documents provide a good indication of what the final
standard is likely to look like. Table 1 shows the main stages.

Table 1 The main stages of ISO standards development

The Ugly side of standards


Although ergonomics standards are generally concerned with such mundane topics as
keyboard design or menu structures, they nonetheless generate considerable emotion amongst
standards makers. Sometimes this is because the resulting standard could have a major
impact on product sales or legal liabilities. Other times the reason for the passion is less clear.
Nonetheless, the strong feelings may result in what I have called the ugly side of standards. In
terms of the standardisation process, the ugly side includes:

• large multinational companies exerting undue influence by dominating national
committees. Although draft standards are usually publicly available from national
standards bodies, they are not widely publicised. This means that it is relatively easy for
well informed large companies to provide sufficient experts at the national level to ensure
that they can virtually dictate the final vote and comments from a country.

u end user’s requirements being compromised as part of ‘horse trading’ between


conflicting viewpoints. In the interests of reaching agreement, delegates may resort to
making political trade-offs largely independent of the technical merits of the issue.

• national pride leading to uncritical support for a particular approach or
methodology. In theory, participants in Working Group meetings are experts nominated
by member bodies in the different countries. They are not there to represent a national
viewpoint but are supposed to act as individuals. However, as one disillusioned expert
explained to me ‘sometimes the loudest noise at a Working Group meeting is the grinding
of axes’

However, it is not just the process which is ugly. The standards themselves can leave much to
be desired in terms of brevity, clarity and usability as a result of:

• stilted language and boring formats. The unfriendliness of the language is illustrated by
the fact that although the organisation is known by the acronym ISO, its full English title
is the International Organisation for Standardisation. The language and style are governed
by a set of Directives and these encourage a wordy and impersonal style.

• problems with translation and the use of ‘Near English’. There are three official
languages in ISO—English, French and Russian. In practice, much of the work is
conducted in English, often by non-native speakers. As someone who only speaks
English, I have the utmost respect for those who can work in more than one language.
However, the result of this is that the English used in standards is often not quite correct—
it is ‘near English’. The words are usually correct but the combination often makes the
exact meaning unclear. These problems are exacerbated when the text is translated.

• confusions between requirements and recommendations. In ISO standards, there are
usually some parts which specify what has to be done to conform to the standard. These
are indicated by the use of the word ‘shall’. However, in ergonomics standards, we often
want to make recommendations as well. These are indicated by the use of the word
‘should’. Such subtleties are often lost on readers of standards, especially those in
different countries. For example, in the Nordic countries, they follow recommendations
(shoulds) as well as requirements (shalls), so the distinction is diminished. In the USA,
they tend to ignore the ‘shoulds’ and only act on the ‘shalls’.

The Bad side of standards


I used the expression ‘ugly’ to describe the result of extreme passion in the development of
standards. In this part of the paper, I discuss what might be seen as the result of too little
passion. The bad side is that standardisation is very slow as a result of:

• an apparently leisurely pace of work. One of the reasons is that there is an extensive
consultation period at each stage with time being allowed for national member bodies to
circulate the documents to mirror committees and then to collate their comments. Another
reason is that Working Group members can spend a great deal of time working on drafts
and reaching consensus only to find that the national mirror committees reject their work
when it comes to the official vote. It is particularly frustrating for project editors to receive
extensive comments (which must be answered) from countries who do not send experts to
participate in the work. Of course, the fact that the work is usually voluntary means that it
is difficult to get people to agree to work quickly.

• too many experts. This might sound like an unlikely problem but given the long
timescale mentioned above it can be a significant factor in slowing down the process. The
reason is that many experts are only supported by their organisations for a relatively short
time and are then replaced by other experts. Every time a new expert joins the Working
Group, there is a tendency to spend a lot of time explaining the history and to some extent
starting the process again. Similarly, each expert feels obliged to make an impact and
suggest some enhancement or change in the standard under development. Since the
membership of Working Groups can change at virtually every meeting (which are usually
three or four months apart), it is not uncommon for long-standing members to find
themselves reinstating material which was deleted two or three meetings previously (as a
result of a particularly forceful individual).

While I do not accept that we have produced bad standards (at least in our committee), our standards
have been criticised for being too generous to manufacturers in some areas and too restrictive in
other areas. The ‘over-generous’ criticism misses the point that most standards are setting minimum
requirements and in ergonomics we must be very cautious about setting such levels. However,
there certainly are areas where being too restrictive is a problem. Examples include:

• ISO 9241–3:1992 Ergonomics requirements for work with VDTs: Display
Requirements. This standard has been successful in setting a minimum standard for
display screens which has helped purchasers and manufacturers. However, it is biased
towards Cathode Ray Tube (CRT) display technology. An alternative method of
compliance based on a performance test (which would be technology independent) is still
under development and is unlikely to be finalised in the near future.

• ISO CD 9241–9 Ergonomics requirements for work with VDTs: Non-keyboard input
devices. This standard is suffering because technological development is faster than either
ergonomics research or standards making. Although there is an urgent need for a standard
to help users to be confident in the ergonomic claims made for new designs of mice and
other input devices, the lack of reliable data forces the standards makers to slow down or
run the risk of prohibiting newer, even better solutions.

The Good side of standards


I would not spend my time (largely unfunded) developing standards if I did not believe that
they are largely good for ergonomics. Major strengths in the process are that it is:

• based on consensus. Manufacturers (and ergonomists) make wildly different claims
about what represents good ergonomics. This is a major weakness for our customers who
may conclude that all claims are equally valid and there is no sound basis for any of it.
Standards force a consensus and therefore have real authority in the minds of our
customers. Achieving consensus requires compromises, but then so does life.

• international. Although there are national and regional differences in populations, the
world is becoming a single market with the major suppliers taking a global perspective.
Variations in national standards and requirements not only increase costs and complexity,
they also tend to compromise individual choice. Making standards international is one
way of ensuring that they have impact and can help improve the ergonomics quality of
products for everyone.

We have produced a number of useful standards over the past few years. These are not only
useful in providing technical information in their own right but serve to ensure that
ergonomics issues are firmly placed on management agendas. Many organisations feel
obliged to take standards seriously and therefore even if they were not predisposed towards
ergonomics initially, the existence of International Standards ensures that they are given due
consideration. As consultants, we know that basing our recommendations on agreed
standards gives them far greater authority than citing relevant research.
There is not space in this paper to list all the relevant standards but a few key examples
include three from the ISO 9241 series and ISO 13407.

• ISO 9241–2:1992 Guidance on Task Requirements. This standard sets out key points
on job and task design and provides a sound basis for persuading managers and system
developers that such issues require proper attention if systems are to be successful.

• ISO 9241–3:1992 Visual Display Requirements. This standard allows purchasers to
have some confidence in the ergonomic quality of computer displays. This is particularly
important for managers who wish to meet their obligations under the European Directive
on work with display screen equipment.

• ISO 9241–10:1996 Dialogue Principles. This standard sets out some key principles of
dialogue design and gives useful examples to illustrate how the relatively simple
principles apply in practice. The European Directive also requires employers to ensure
that systems meet the principles of software ergonomics. This standard gives them an
external benchmark which they can incorporate in procurement specifications.

• ISO DIS 13407 Human Centred Design for Interactive Systems. This standard is an
attempt to solve the problem of developing ergonomics standards quickly enough in a fast
changing technical environment. The standard provides guidance for project managers to
help them follow a human-centred design process. By undertaking the activities and following
the principles described in the standard, managers can be confident that the resulting systems
will be usable and work well for their users. If their customers require evidence of human-
centredness, the standard gives guidance on how to document the process.

The way forward


Although I believe standards are an important tool for the ergonomist, many people find them
difficult to understand and use. In part, this is because people sometimes expect too much from
standards. They cannot represent the latest ideas and they are not going to help much with the
more creative parts of design. However, they often represent important constraints and may give
some guidance on what has worked in the past. The best way to really understand what is going
on in standards is to get involved. This will give you advance warning of future standards, the
opportunity to influence the content of standards and an understanding of the context in which
they have been developed. You will then find it much easier to make effective use of standards.
If you do not know who to contact, let me know what you are interested in helping with (email
tom@systemconcepts.com) and I’ll send you details.
INTERNATIONAL STANDARDISATION OF GRAPHICAL
SYMBOLS FOR CONSUMER PRODUCTS

Fred Brigham

Philips Design, Building HWD, PO Box 218,


5600 MD Eindhoven, The Netherlands

The paper discusses international activities concerned with the
standardisation of graphical symbols and complementary activities in a
major electronics company focussing on symbols used for consumer
products. Issues relating to the practical application of symbols in an
industrial setting are described, including procedures for new symbols and
tools to provide worldwide access. Results from user tests of a proposal for
international symbols are presented to illustrate some of the problems
involved in designing comprehensible symbols. The paper concludes by
stressing the need to focus on the communicative processes involved.

Introduction
The main organisations concerned with the international standardisation of graphical
symbols are as follows:

International Organization for Standardization (ISO)


The main ISO technical committee dealing with graphical symbols is ISO TC145, which has
three subcommittees dealing with public information symbols (SC1), safety signs and
symbols (SC2), and graphical symbols for use on equipment (SC3). The main publication
containing graphical symbols for use on equipment is ISO 7000.
The symbols for use on equipment are developed by the technical committee responsible
for the equipment following the rules in ISO 3461–1 (General principles for the creation of
graphical symbols) and ISO 4196 (Use of arrows).

International Electrotechnical Commission (IEC)


The IEC committee responsible for graphical symbols for use on electrotechnical equipment
is IEC SC3C. New symbols are proposed by technical committees or by national
standardisation organisations, and SC3C plays an active role in the approval procedure.
The main publication containing graphical symbols for use on electrotechnical equipment
is IEC 60417. This document is produced electronically from a database. The database may
be linked to a web site in the near future allowing users to search for suitable symbols and
either download the drawings or order them on CD-ROM.

ISO/IEC Joint Technical Committee 1


This joint technical committee is concerned with information technology and, because of its
size and importance, can be considered separately from the two parent organisations. JTC1
Working Group 5 is responsible for graphical symbols for office equipment and also icons. A
collective standard containing graphical symbols for office equipment is being prepared.

International Telecommunication Union (ITU)


Graphical symbols to assist users of telephone services are published in ITU-T
Recommendation E.121. This document includes symbols for videotelephony which have
been developed by The European Telecommunication Standards Institute (ETSI) using the
Multiple Index Approach which is described in ETSI Technical Report ETR 070 (1993).

Use of graphical symbols for consumer products


Philips is a multinational company which produces a wide range of consumer products. As
most of the products are electrotechnical, IEC 60417 is the major source of symbols. Figure 1
shows examples of symbols from IEC 60417 adapted for use within Philips. Many of the
symbols are also used as icons.

Figure 1. Some graphical symbols used by Philips

A major concern of Philips Design is the usability of products and this is taken into
account in the policy with regard to the use of graphical symbols and the development of new
symbols. Some of the practical issues arising are as follows:

Symbols or text
In general, text describing a function in the user’s native language is likely to be understood
better than a graphical symbol. Where there is no compelling reason to use graphical symbols it
is better to use text. However, graphical symbols are useful where there is insufficient space to
use text, e.g. on a remote control, or where use of the equipment must be language independent.

Pictogram or abstract symbol


The symbols on the left of Figure 1 are pictograms, i.e. they depict an object. Those on the
right are abstract symbols. It might be thought that the pictograms can be more easily
understood than the abstract symbols. The meaning of abstract symbols has to be learned but
some are well understood, e.g. the arrows used for “play”, “fast forward” etc. on tape players.
Furthermore a pictorial representation may limit the application of the symbol or may
become out of step with developments in technology.

Approval of new symbols


In a company as diverse as Philips with relatively independent lines of business operating in
all parts of the world, it is necessary to control the use of graphical symbols. This is done in
the consumer products area by means of a company standard containing “approved” symbols.
Where no appropriate symbol can be found either in the company standard or in relevant
international standards, a new symbol must be developed. The Philips internal procedure
involves submitting the proposed new symbol for immediate comment by an expert panel
followed by circulation to a wider group of interested parties prior to publication in the
standard. Wherever possible, new symbols are tested as part of the user interface.

Accessing the information


The printed company standard for symbols is in the process of being replaced by a web site on
the Philips intranet. This has the advantage that the up-to-date collection of approved symbols
and new proposals under consideration can be accessed from anywhere in the world. The database
can be searched using keywords and, when appropriate symbols have been found, the drawings
can be downloaded as electronic files for immediate use. The approval procedure for new symbols
has also been made more effective and efficient by linking this to the web site.

Testing graphical symbols


It has been noted above that symbols should be tested wherever possible to ensure that their meaning
will be understood. The following example illustrates some of the issues involved in developing
symbols which can be reliably comprehended and distinguished. The testing of symbols is part
of the Philips quality policy for consumer products which is committed to putting in place processes
and tools which enable quality of use to be managed in a systematic way.

Symbols for timer functions


In response to the needs of the electrotechnical industry, an IEC proposal was circulated
containing symbols for the time functions “elapsed time”, “remaining time”, “programmable
start” and “programmable stop”. The symbols concerned are shown as Set 1 in Table 1.
Although the functions may initially appear complex, they are all functions which may be
found on consumer equipment in the living room, bedroom or kitchen. Philips was concerned
about the comprehensibility of the proposed symbols and decided to test them.

Testing procedure
The test of pictogram associativeness from the ETSI Multiple Index Approach, ETR 070
(1993), provided the basis for the test method. The procedure was as follows:

• One of the functions was described and drawings for all four functions presented. Subjects
were asked to choose the drawing which best represents the function.
• This was repeated for all four functions with the order randomised between subjects and
stressing that each choice should be independent.
• Subjects were typical users of consumer equipment (N≥24 for full test).

Two further sets of symbols were tested at separate times. Set 2 was used in a pilot test
conducted by Philips to explore possible alternatives to Set 1. Set 3 was circulated in the most
recent IEC draft.

Results
The number of correct selections of the symbols (hit rate) is shown in Table 1. The figure in
brackets is the percentage of subjects correctly selecting the symbol concerned. Where the hit
rate differs significantly from that expected by chance (25%), this is indicated by the asterisks
(chi-square goodness-of-fit test, ** = p<.01, *** = p<.001; note that the expected frequencies for
Set 2 are less than 5).

Table 1. Results of testing the timer symbols

The results for Set 1 indicate that the symbols for elapsed time and remaining time were
correctly selected significantly less than would be expected by chance. These are not just bad
symbols. In combination with the other two symbols they are actively misleading.
Set 2 shows a large improvement in the discriminability of the symbols. The pair of symbols
proposed for elapsed/remaining time work well and can be distinguished from each other and
from the other pair. However, they were considered unsuitable for the subsequent draft because
they could potentially refer to any elapsed and remaining quantity, and not necessarily to time.
Set 3 appeared in the most recent IEC draft and the symbols are likely to be approved for
publication (with minor graphical enhancements).
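As an aside (not part of the original paper), the chance-level comparison reported above can be reproduced with a one-sample chi-square goodness-of-fit test. The sketch below is illustrative only: the hit counts are hypothetical rather than the published results, the subject and alternative counts (N = 24, four symbols) follow the procedure described earlier, and the scipy.stats library is assumed to be available.

    # Illustrative chance-level test for a forced-choice symbol study
    # (hypothetical data, not the actual Philips/IEC results).
    from scipy.stats import chisquare

    n_subjects = 24                      # "N >= 24 for full test"
    n_alternatives = 4                   # four timer-function symbols per set
    chance_rate = 1.0 / n_alternatives   # 0.25 with four alternatives

    def chance_level_test(hits, n=n_subjects):
        """Chi-square goodness-of-fit test of an observed hit count against chance.

        Observed categories are (correct, incorrect); expected counts under
        chance are n*0.25 and n*0.75. Returns the statistic and p-value.
        """
        observed = [hits, n - hits]
        expected = [n * chance_rate, n * (1 - chance_rate)]
        return chisquare(f_obs=observed, f_exp=expected)

    # Hypothetical hit counts for one symbol from each set (NOT the published values).
    for label, hits in [("Set 1 symbol", 2), ("Set 3 symbol", 17)]:
        stat, p = chance_level_test(hits)
        print(f"{label}: {hits}/{n_subjects} correct "
              f"({100 * hits / n_subjects:.0f}%), chi2={stat:.2f}, p={p:.4f}")

With two categories (correct versus incorrect) the test has one degree of freedom; the paper's caveat about expected frequencies below 5 for Set 2 reflects the usual validity condition for this approximation.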

Discussion
Had it not been for the intervention of Philips, the symbols in Set 1 might have been
published by the IEC. On the basis of the test results it was possible to analyse why the
results were so poor. The challenge was to provide clear graphical cues which distinguish the
two pairs of related symbols (elapsed/remaining time and programmable start/stop) and also
distinguish the symbols within each pair. The graphical features in Set 3 (single versus double
clock hands, large arrow heads, large dots) are all intended to enhance the distinctions and the
results show the positive effects.
It might be thought that much higher hit rates should be achieved but it should be borne in
mind that this is an extremely demanding test in view of the fact that the functions are so
similar. If one of the symbols had been tested with three completely different symbols, the hit
rate might have been 100%. This reflects a difficulty with the Multiple Index Approach, i.e.
the individual results are entirely dependent on the other symbols in the set which is tested.
This means that the set chosen must have ecological (i.e. contextual) validity and the results
cannot be extrapolated to other situations.
A further important issue is learning. The test procedure can be repeated to reveal learning
effects but normal practice is to focus on the first time comprehensibility or discriminability
of the symbols. In practice, users may come across the same symbols repeatedly and
subsequent recognition may therefore be more important.

Designing effective symbols


It is not possible to provide detailed guidance on how to design effective symbols within the
scope of this paper but a number of important issues will be mentioned.
Firstly, it is essential to focus on the communicative processes involved in the understanding of
graphical symbols. Barnard and Marcel (1984) provide an excellent exposition of this approach.
It is important not to focus on the representation of objects for their own sake but to provide
semantic elements which communicate the appropriate message resulting in the appropriate
behavioural response of the user. The two symbols shown in Figure 2 illustrate the point:

Figure 2. Alternative proposals, “Boiler empty” (left) and “Fill boiler” (right)

A set of symbols was required for a domestic iron with a separate water container (boiler).
The “Boiler empty” symbol was designed to complement several other symbols which
depicted the boiler. However, an analysis of the message to be communicated indicated that
the purpose of this symbol was to prompt the user to fill the boiler. This understanding led to
the design of the alternative symbol “Fill boiler” which gives a clearer message to the user.
The results of testing the timer symbols highlight the need to consider carefully how
related groups of symbols can be discriminated from other groups, and how the individual
symbols within the groups can be discriminated from each other. The separate needs of
designing for first time comprehension and later recognition also need to be addressed by the
appropriate use of graphical and semantic elements.
More detailed guidelines, including guidelines for the management of the development
process, can be found in Horton (1994).

References
Barnard, P. and Marcel, T. 1984, Representation and understanding in the use of symbols
and pictograms. In Easterby, R.S. and Zwaga, H.J.G. (eds) Information Design 1984,
(John Wiley and Sons, Chichester), 37–75
ETSI Technical Report ETR 070, June 1993, The Multiple Index Approach (MIA) for the
evaluation of pictograms (European Telecommunication Standards Institute, Sophia
Antipolis)
Horton, W.K. 1994, The Icon Book: Visual Symbols for Computer Systems and
Documentation (John Wiley & Sons, New York)
THE UK HUMAN FACTORS DEFENCE STANDARD:
PAST, PRESENT AND FUTURE

Roger S Harvey

Systems Psychology Group


Centre for Human Sciences
DERA Farnborough
Hants GU14 6TD

This paper provides a historical review of the development of the United Kingdom
Defence Human Factors Standard, DEF STAN 00–25. This can trace its
origins to the early 1960’s when as a joint Royal Naval Personnel Research
Committee/Medical Research Council (RNPRC/MRC) handbook for
designers, the text was intended for use within the Royal Navy and to be a
summary of data and guidelines. Further developments of this early venture
led to what is now known as DEF STAN 00–25, with tri-Service
applications, and initially with a list of Parts which deliberately mirrored
those of the RNPRC/MRC handbook. At the end of 1997 the text of this and
many other DEF STANs was made available on the Internet thus making the
knowledge widely available in digital form.

Early beginnings

The Navy Handbook


From the Second World War onwards the UK Ministry of Defence has attached significance
to the role that ergonomics can play in the evolution of equipment matched to the cognitive
and physical demands of users and maintainers. In recent years substantial programmes and
methodologies such as MANPRINT (initiated within the USA) and Human Factors
Integration (HFI, a UK programme derived in part from the core of MANPRINT
methodologies and activities) have provided the necessary impetus and encouragement for
system designers and engineers. However during the first three decades after the Second
World War the role played by a number of enthusiastic, committed psychologists and human
factors scientists was crucial in building upon the foundations provided by the then MOD
Research Establishments such as the Clothing Equipment and Physiological Research
Establishment (CEPRE), the RAF Institute of Aviation Medicine (RAF IAM), and the Royal
Aircraft Establishment (RAE). These Establishments, other smaller units, and their
successors such as the Army Personnel Research Establishment (APRE), have provided a
continuing commitment to ergonomics and human factors culminating in the amalgamation,
during the last two years, of many of their constituent elements to form the Centre for
Human Sciences (CHS) as a sector of the Defence Evaluation and Research Agency
(DERA).
In UK MOD headquarters during the 1960s and 1970s a small number of psychologists,
including Hugh Stockbridge, Edward Elliott and Ken Corkindale, provided the necessary
leadership and encouragement within the lengthy corridors of Old Admiralty Building and
Main Building, so as to secure continual funding and impetus for a number of the elements of
those early units and establishments. The combination of MOD (intra-mural) research, and
extra-mural funding for universities, for example, ensured the continuation of the British
heritage of the application of ergonomics within defence equipment design.
Hand in hand with this background of activity it was clear that the provision of clearly
written technical documentation would add the necessary human factors data and guidance
that designers and engineers needed for their tasks within Industry. In the 1960’s Hugh
Stockbridge had moved from the wooden huts of CEPRE Farnborough to the corridors of the
Old Admiralty Establishment in Spring Gardens (just off Whitehall) and it was here that a
Handbook took shape that was to provide the solid footing for the document which would
eventually become the United Kingdom equivalent of US MIL-STD 1472, namely DEF
STAN 00–25.
Under the auspices of the Royal Naval Personnel Research Committee, and in
collaboration with staff of the Medical Research Council, the “Royal Navy Handbook for
Designers” was produced during the 1960’s. This took the form of a plastic ring-binder (in an
appropriate shade of blue and smartly emblazoned with gold lettering on the cover)
containing 12 typewritten chapters, each one devoted to a self-contained topic of human
factors and ergonomics data and design guidance. The Handbook was an early success,
although it has to be said that the landscape layout of the pages meant that it was considered
by some to have “difficult handling characteristics”.
The emphasis in the Handbook was upon easily readable data, and checklists of actions.
Intended in the first instance for designers and engineers who lacked specialist human factors
knowledge, it was successful enough to find its way onto the shelves of many psychologists
and ergonomists who found its technical summaries invaluable, and several copies still exist
at CHS and other establishments.
However, the early success of this venture was temporarily cut short a few years later
when funding for a projected second edition could not be found. Paradoxically this was to
prove a blessing in disguise, because it provided the necessary impetus for the Senior
Psychologist (Naval) to gather together a number of scientists from all three Service human
factors units and Establishments in order to lay the groundwork for a future tri-Service
document. With the encouragement of Edward Elliott and Ken Corkindale, Hugh Stockbridge
became associated with the newly formed editorial planning committee and, since all defence
committees had to have a name, suggested its imaginatively inappropriate acronym SCOTSH
(Steering Committee for the Tri-Service Handbook).
At an early stage it was agreed that the original Navy Handbook would become the model
for the contents of the prospective Tri-Service equivalent, and MOD Directorate of
Standardization confirmed that the final document would form a new Defence Standard, DEF
STAN 00–25 Human Factors for Designers of Equipment.

The Present

The development of DEF STAN 00–25


And so the scene was set for the conversion of the Navy Handbook into what is now the
somewhat more garish, orange-covered, Directorate of Standardization-sponsored DEF
STAN 00–25 booklets, covering a substantially greater range of technical information than
the original Handbook. Over the next 15 years Hugh’s dream of a stand-alone Defence
Standard devoted to human factors and ergonomics data and design guidelines was gradually
implemented, culminating in the publication of the last of 12 Parts of DEF STAN 00–25 with
a structure deliberately based on the original Navy Handbook.
In the late 1970s the author was invited to join the editorial subcommittee of SCOTSH
chaired by Dr Maurice Elwood of APRE, and later succeeded him in this position. This
subcommittee then took over all responsibility for commissioning and editing of Parts, and later
formed up as a Directorate of Standardization Committee with a descriptor which can only be
described as prosaic (El8) when viewed alongside Hugh’s imaginative SCOTSH. Over the
ensuing years Hugh’s plan was slowly put into place and for those who are unfamiliar with the
contents, they are given in Table 1 below, together with the more recent Part 13.

Table 1. DEF STAN 00–25 Current Contents

In the early days authors were drawn almost exclusively from University Departments, for
example the late John Spencer was the author of the first edition of Parts 6 and 7. In later
years a number of companies responded to invitations to tender for authorship; BAe Sowerby
Research Centre provided the authors of Parts 10 and 12, whilst System Concepts Ltd
provided the authors for the revised editions of Part 6 and Part 7.
The protracted gestation period of some 10 years for all Parts was due principally to the
limited availability of funding in each financial year. However, there were (fortunately) rare instances
when an early draft did not quite seem to “gell” and substantial re-writing delayed the
original timetables. The original draft of Part 5 (Stresses and Hazards) was one such draft
which underwent significant editorial changes, including one which caused a certain amount
of “black humour” at the time. Within the section on the effects of nuclear detonation on the
human body, it was noted that the author had stated that the most serious effect of such
detonation was blindness. Regrettably there seemed to be no mention in any part of the text of
the effects of radiation, nor of the likely deaths to be caused by explosion!
Few Parts were without their last minute panic at the final proof stage. For Part 7 (Visual
Displays) it was the discovery that a diagram of a submarine display was upside down, with
the result that the vessel seemed to be diving when it should have been surfacing, and
vice versa. For Part 10 (Controls) it was the description of a control which was so garbled that
Hugh Stockbridge felt that it was more appropriate for a Zulu spear; the original text had
described a particular control as “a ball-shaped knob-knob”.
Each Part was intended to act as an up-to-date sourcebook of data and guidance for
designers of defence materiel. Then, as now, three guiding principles governed the authors.
Firstly, the Parts were NOT intended to be voluminous textbooks in their own right—this was
undoubtedly the most difficult principle to adhere to and was the cause of much grief at the
editorial stages of several Parts! Secondly, particular care had to be taken to ensure the
applicability of each Part to all three Services. Finally, the primary readership was designers
representing a wide spectrum of technical background and knowledge, but who were NOT
assumed to be human factors specialists.
The use of this Standard has now become common within the defence sector, but it finds
increasing application within the civil sector of Industry both here in the UK and
elsewhere. When compared with MIL-STD 1472E, for example, there is clearly some
duplication of approach although naturally there is substantial tailoring to local markets.
However it is interesting to note that increasing numbers of requests for DEF STAN 00–25
now come from the USA. Recent requestors have included the US Coastguard, and Ford
Motor Co Research Centre, Michigan.
Until very recently the text of DEF STAN 00–25 has been published exclusively via the
traditional medium of paper, although it is interesting to note that some 10 years ago MOD gave
approval to two defence contractors to produce hypertext versions for internal Company use.
Undoubtedly the future lies in other media as the final section of this paper indicates.

The digital future


During the last 5 years Assistant Director/Standardization (AD/Stan) has been examining the
feasibility of making Defence Standards available digitally. A number of Defence Standards
have been available on CD-ROM for some time, but the exponential growth in the application
and use of the Internet has forced a recent examination of what would have been a decidedly
novel means of dissemination to the early authors and editors of the Naval Handbook. During
1996 plans were made to launch UK Defence Standards on the Internet, through the medium
of the AD/Stan pages on the World Wide Web. These plans have recently been confirmed and
it is hoped that UK Defence Standards will be accessible on the Internet shortly after
December 1997.

Dedication
This short paper was written as a contribution to a session at the Ergonomics Society 1998
Annual Conference held in memory of the late Hugh Stockbridge, Honorary Fellow of the
Ergonomics Society.

Acknowledgement
The author would like to thank his colleague Ian Andrew, and Donald Anderson, for their
initial encouragement to submit this paper, and for their helpful memories contributed during
the drafting process. Any errors remain those of the author.

© Crown Copyright 1998


STEPHEN PHEASANT
MEMORIAL SESSION:

THE CONTRIBUTION
OF ERGONOMICS TO
THE UNDERSTANDING
AND PREVENTION OF
MUSCULOSKELETAL
DISORDERS
STEPHEN PHEASANT (1949–1996)
MEMORIAL LECTURES

The Contribution of Ergonomics to the Understanding and Prevention of
Musculoskeletal Disorders

Sheila Lee
7 Antrim Grove
London NW3 4XP

David Stubbs
Centre for Health Ergonomics
EIHMS, University of Surrey
Guildford, Surrey GU2 5XH

INTRODUCTION
As a scientist Stephen Pheasant believed in the common sense of ergonomics. He cared
passionately about the design of work and about the physical comfort, psychological symbiosis
and job satisfaction of the working man and woman. He believed that the spirit of the worker should be
reinforced by work rather than depleted by it, as he had so often found to be the case.

Stephen was a ‘hands on’ ergonomist. In addition to employing the utmost academic rigour,
he returned time and again to the dissecting room, where he would examine specimens at first
hand to try to discover where the connections might lie between working actions and the
resulting injury. One of his common phrases was ‘God is in the details’. His profound
knowledge of anatomy led him to research and explore the anatomical—biomechanical—
physiological details by which ergonomic injuries occurred. He would never tire of
discussing the subject with colleagues, clinicians and students.

The demands of consumerism concerned him. He was concerned that high production methods
frequently caused musculoskeletal injury and psychological stress. He was concerned at the
extent of old-fashioned authoritarian styles of management. But what concerned and upset him most was (to use his own words) that when people became injured they were treated like old pieces of broken factory or office machinery, and simply dismissed.

The importance that Stephen attached to the topic of musculoskeletal disorders is reflected in the following papers, which address the contribution of ergonomics to our understanding and prevention of these conditions. They emphasise both physical and psychosocial work factors. In particular, the papers highlight the importance of combined exposure to both sets of factors in the manifestation of musculoskeletal disorders. A series of challenges is presented for ergonomics, which are considered a suitable addition to Stephen's legacy. The papers also serve as a testament to Stephen, who will always be remembered with affection, respect and regard, and by many as an inspiring friend.
THE CONTRIBUTION OF ERGONOMICS TO THE
UNDERSTANDING AND PREVENTION OF
MUSCULOSKELETAL DISORDERS: THE ROLE OF
PHYSICAL ASPECTS

Peter Buckle

Reader
Robens Centre for Health Ergonomics
EIHMS, University of Surrey
Guildford, Surrey, GU2 5XH

The vast contemporary research literature reflects current concern over these
disorders. As this research has become available it has become apparent that
simple theories of “trauma” to tissues, followed by pain and then recovery
do not adequately explain the observed facts. Similarly, the relationship
between so-called “physical” factors and the development of, in particular,
back and upper limb disorders, is complex and does not allow a simple
prediction of the effects that changes in physical exposure will have on
musculoskeletal outcomes. The failure to “explain” the origins and aetiology
of these disorders through an examination of physical factors has led to an
exploration of psychological, organisational and sociological factors. This
research has also proved, so far, to be of only limited help in explaining the
phenomena. It can be argued that, through an increasing focus on these
elements, the relative importance of physical factors has been diminished.
The epidemiological evidence in support of work related, and particularly
physical factors and combinations of physical factors, in the development of
a number of musculoskeletal disorders is, nevertheless, strong. It can be
argued that ergonomic interventions that reduce such exposures may still
prove to be the most effective means of reducing the prevalence of these
disorders.

Introduction
The role of physical and psychosocial factors in the development and expression of
musculoskeletal disorders has been well documented, with physical factors being identified
in many early documents and texts. The increasing contemporary research literature reflects
the concern over these disorders. As this research has become available, it has become apparent that simple theories of "trauma" to tissues, followed by pain and then recovery, do not adequately explain the observed facts (e.g. Armstrong et al, 1993).

Similarly, the relationship between so-called “physical” factors and the development of, in
particular, back and upper limb disorders, is complex and does not allow a simple prediction
of the effects that changes in physical exposure will have on musculoskeletal outcomes.
The failure to “explain” the origins and aetiology of these disorders through an
examination of physical factors has led to an exploration of psychological, organisational and
sociological factors (e.g. Bongers et al, 1993, 1995; Sauter and Swanson, 1996; Buckle, 1997a). This research has also proved, so far, to be of only limited help in explaining the
phenomena.

Epidemiological Evidence
It can be argued that, through an increasing focus on these elements, the relative importance
of physical factors has been diminished. The epidemiological evidence in support of work-related factors, and particularly physical factors and combinations of physical factors, in the development of a number of musculoskeletal disorders is strong (NIOSH, 1997; Buckle, 1997b).
The relative importance of these factors is presented, selectively, in Table 2 (see NIOSH,
1997). From such an examination, it can be argued that ergonomic interventions that reduce
such exposures may still prove to be the most effective means of reducing the prevalence of
these disorders.

Evidence from Intervention Studies


Ergonomists have consistently failed to provide sufficient evidence of the efficacy of work
system interventions. Whilst the major challenges are methodological, if they are to be
overcome then simplistic notions of physical exposure must be replaced with an
understanding of the components of physical exposure and their interactions.
Simple interventions are implemented routinely by many practitioner ergonomists. Few of
these appear to be evaluated with any degree of rigour and only rarely are control data
gathered to allow the relative effect of changes to be obtained. Thus it is not possible to state
whether interventions based on physical factors alone are sufficient or whether more complex
interventions are required.
Interventions at a societal level (e.g. European Union Directives) may be of only limited benefit without greater evidence of the potential benefits. For example, a recent HSE review of the Manual Handling at Work Directive showed that, of the organisations surveyed, only 30% responded, and of those only 30% (i.e. around 9% of all organisations surveyed) had heard of the directive. It is therefore likely that some 90% of organisations have made little headway in addressing these problems.

Conclusions
This paper re-iterates the need to utilise existing data on physical factors more effectively in
the workplace and to document the results. It does not underestimate the importance of other
factors (e.g. the psychosocial, individual) in the manifestation of these disorders, but suggests
that understanding the complex interactions between variables and within individuals is still a
distant goal.

Table 1 Historical Perspective of Exposure and Back Pain

Table 2 NIOSH (1997) Summary Table



References
Armstrong, T.J., Buckle, P.W., Fine, L.J., Hagberg, M., Jonsson, B., Kilbom, A., Kuorinka,
I., Silverstein, B.A., Sjogaard, G., Viikari-Juntura, E. 1993 A conceptual model for
work-related neck and upper-limb musculoskeletal disorders. Scand J Work Environ
Health 19:73–84
Bongers, P. M., de Winter, C.R., Kompier, M.A.J., & Hildebrandt, V.H. 1993, Psychosocial
factors at work and musculoskeletal disease, Scandinavian Journal of Work
Environment and Health, 19, 297–312
Buckle, P 1997a Upper limb disorders and work: the importance of physical and
psychosocial factors. Journal of Psychosomatic Research, 43, 1, 17–25
Buckle, P. 1997b Work related upper limb disorders. British Medical Journal, 315, 1360–3
HSE 1990 Work related upper limb disorders: a guide to prevention
Linton, S.J. 1990, Risk factors for neck and back pain in a working population in Sweden,
Work and Stress, 4, 41–49
NIOSH 1997 Musculoskeletal disorders and workplace factors DHHS (NIOSH) Publication
No. 97–141
Sauter, S.L. & Swanson, N.G. 1996. An ecological model of musculoskeletal disorders in
office work. In S.D.Moon & S.L.Sauter (eds.), Beyond Biomechanics: Psychosocial
Aspects of Musculoskeletal Disorders in Office Work (Taylor and Francis, London),
1–22
THE COMBINED EFFECTS OF PHYSICAL AND
PSYCHOSOCIAL WORK FACTORS

Jason Devereux

Research Fellow
Robens Centre for Health Ergonomics
EIHMS, University of Surrey
Guildford, Surrey, GU2 5XH

Physical and psychosocial work factors have been implicated in the complex
aetiology of musculoskeletal disorders. Psychosocial work factors differ from
individual psychological attributes in that they are individual subjective
perceptions of the organisation of work. An ergonomic epidemiological study
was undertaken to determine the impact of different combinations of physical
and psychosocial work risk factors upon the risk of musculoskeletal disorders.
Physical work factors are more important determinants of recurrent back and
hand/wrist problems than psychosocial work factors. The greatest risk of musculoskeletal problems occurs when workers are exposed to both physical and psychosocial work risk factors. Ergonomic strategies should, therefore, aim
to reduce physical and psychosocial risk factors in the workplace.

Introduction
“We are fiercely competitive in our consumption of goods and services; and our sense of self-
worth is tied up in our use of status symbols. This lies at the root of our stress levels.”
(Pheasant, 1991)

Stephen Pheasant realised that humanity has created a 'milieu' of self-imposed stress by increasing the demand for goods and services. Satisfying that demand has been formalised into work organisation goals, culture and beliefs, but at what cost to the individual worker, from whom, ironically, the demand originated? The work organisation imposes physical and psychosocial stressors upon the individual. The physical stressors originate from environmental, manual handling and other physical demands, and the psychosocial stressors originate from perceptions of the organisation and the way the work is organised. Models have been proposed that describe
the probable pathways by which these work factors can impose a threat such that symptoms,
signs and diagnosable pathologies of musculoskeletal disorders can ensue (Bongers et al, 1993;
Sauter and Swanson, 1996; Devereux, 1997a). The model by Devereux (1997a) proposed that
the individual perceptions of the work organisation (e.g. the social support, the control afforded by the work and the demands imposed) may be influenced by the capacity to cope with such psychosocial stressors. Individual capacity may be affected by a number of factors including
previous injury, cumulative exposure to work risk factors, age, recovery, and beliefs and attitudes
towards pain. Beliefs, attitudes and coping skills have collectively been referred to as psychosocial
factors by some (Burton, 1997), but to minimise confusion, they are referred to as individual
psychological attributes in this text.
The relationship between risk factors and musculoskeletal problems may be dependent on
the definition of the latter (Leboeuf-Yde et al, 1997) and also on the interrelationship
between risk factors (Evans et al, 1994). For example, many studies have simply considered
the effects of either physical or psychosocial work factors upon the risk of musculoskeletal
disorders. Some studies have considered both sets of factors but have assumed the relationships between them to be independent, when in reality such factors co-exist.
The effect that the mutual existence of these risk factors has upon the risk of musculoskeletal
disorders has not been adequately investigated (Devereux, 1997a, Devereux, 1997b). An
ergonomic epidemiological investigation was conducted to examine the effects upon the
musculoskeletal system of physical and psychosocial work factors acting in different
combinations in a work organisation which employed workers engaged in manual handling,
driving and sedentary office work. Ethical permission for the cross-sectional study was obtained from the University of Surrey Committee on Ethics.

Methods
Company work sites from around the U.K. were randomly selected to participate in the study. The mixed-gender study population (N=1514) was given a self-report questionnaire covering personal data and demographics, physical and psychosocial work factors, and musculoskeletal symptoms. Questions on physical and psychosocial work factors had been validated elsewhere (Wiktorin et al, 1993; Hurrell and McLaney, 1988). Most of the
physical scales had a kappa coefficient greater than or equal to 0.4 (except for bent-over
posture and trunk rotation) and all the psychosocial scales had acceptable alpha coefficients
(0.65–0.95). The musculoskeletal symptom questionnaire included the head/neck, trunk and
upper and lower limbs. Items for the lower back had been validated against a physical
examination using a symptom classification scheme proposed by Nachemson and Andersson
(1982). The kappa values for the 7-day and 12-month prevalence were 0.65 and 0.69
respectively (Hildebrandt et al, 1998).
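The kappa and alpha figures quoted above are standard agreement and internal-consistency statistics. As a minimal illustration of how such a kappa value is obtained (a sketch using hypothetical counts, not the study's own data or code), Cohen's kappa for a 2x2 agreement table can be computed as follows:

```python
def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 agreement table.

    a = both methods positive, b = method 1 positive only,
    c = method 2 positive only, d = both methods negative.
    Counts are illustrative, not study data.
    """
    n = a + b + c + d
    p_observed = (a + d) / n                                      # observed agreement
    p_chance = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2   # agreement expected by chance
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical example: questionnaire item versus clinical examination
print(round(cohen_kappa(40, 10, 8, 142), 2))  # -> 0.76
```

A kappa of 0.4 or above is conventionally taken as at least moderate agreement, which is the threshold applied to the physical exposure scales above.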
Physical and psychosocial work factors that had been shown in previous epidemiological
studies to increase the risk of back disorders by a factor of 2 or greater were selected to
classify individual workers into one of four physical/psychosocial exposure groups. For physical exposure, the criteria consisted of heavy frequent lifting, or relatively lighter but frequent lifting combined with driving. The physical exposures were quantified with respect to level or amplitude and frequency or duration. For psychosocial work factors, mental demands, job control and supervisor and co-worker social support were used to classify workers into low and high exposure groups. Subjects not satisfying the low/high physical and psychosocial criteria were excluded from the analysis. Recurrent back disorder cases were defined as having experienced problems more than 3 times, or for longer than one week, in the previous year, with problems also present within the last 7 days at the time of the survey. These back problems had not been experienced before starting the present job. Univariate Mantel-Haenszel chi-squared statistics were used to test the hypothesis of no association between the exposure criteria variables and recurrent back disorders. Crude analyses and logistic regression provided estimates of the risks associated with exposure to different physical/psychosocial work factor combinations. The potential confounding/modifying effects of age, gender and cumulative exposure (defined as the number of years spent in the present job) were controlled for in the logistic regression.
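To make the exposure-group comparison concrete, the sketch below (an illustration under assumed, hypothetical counts and labels, not the study's actual analysis) assigns a worker to one of the four low/high physical-by-psychosocial groups and computes a crude odds ratio with a Woolf 95% confidence interval against the low/low reference group.

```python
import math

def exposure_group(high_physical, high_psychosocial):
    """Label one of the four physical/psychosocial exposure groups."""
    phys = "high" if high_physical else "low"
    psych = "high" if high_psychosocial else "low"
    return f"{phys} physical / {psych} psychosocial"

def crude_odds_ratio(cases_exposed, noncases_exposed, cases_ref, noncases_ref):
    """Crude odds ratio and Woolf 95% confidence interval for one exposure
    group versus the low/low reference group. All counts are hypothetical."""
    odds_ratio = (cases_exposed * noncases_ref) / (noncases_exposed * cases_ref)
    se_log_or = math.sqrt(1 / cases_exposed + 1 / noncases_exposed
                          + 1 / cases_ref + 1 / noncases_ref)
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, ci_low, ci_high

print(exposure_group(True, True))  # -> "high physical / high psychosocial"

# Hypothetical 2x2 comparison: high/high group versus low/low reference group
or_, lo, hi = crude_odds_ratio(cases_exposed=45, noncases_exposed=85,
                               cases_ref=20, noncases_ref=130)
print(f"OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")  # e.g. OR 3.44 (95% CI 1.90-6.23)
```

Adjustment for age, gender and cumulative exposure, as described above, would normally be carried out with a multivariable logistic regression rather than the crude comparison sketched here.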

Results
There were 869 valid responses to the survey (57% of the total study population, N=1514).
Non-respondents did not differ with respect to gender, age or cumulative exposure. Recurrent
back disorders were prevalent in 22% of the valid number of survey responses. Of the 869 valid
questionnaire responses, 638 workers were classified into low/high physical and psychosocial
exposure groups. The gender, age and cumulative exposure for the exposure stratified population
and the valid questionnaire population did not differ. The univariate analysis for recurrent back
problems showed that heavy frequent lifting (>16 kg, ≥1–10 times per hour; p<0.001) and relatively lighter but frequent lifting (>6–15 kg, ≥1–10 times per hour) combined with driving for half the working day (p<0.001) were associated with recurrent back disorders. Forward bent-over posture (>60 degrees) for greater than a quarter of the working day was also found to be significantly associated with recurrent back problems (p<0.05). Trunk rotation of 45 degrees for greater than a quarter of the working day increased the risk of experiencing recurrent back disorders but was not statistically significant at the 5% level. A perceived high workload was
associated with recurrent back problems (p<0.05). Mental demands, job control, supervisor and
co-worker social support factors were not found to be statistically associated with recurrent
back disorders when considered independently.

Figure 1. The risk of recurrent back problems for different exposure groups

Figure 1 shows the combined risk effects of physical and psychosocial work risk factors
associated with recurrent back disorders. High exposure to physical work risk factors had a
greater impact upon the risk of recurrent back problems compared to psychosocial work risk
factors. For workers highly exposed to physical work risk factors and with relatively low
exposure to psychosocial work risk factors, the approximate risk of experiencing recurrent
back problems was approximately 3 times greater than for workers exposed by a lesser extent
to both physical and psychosocial work risk factors. The risk increased approximately 3.5
times for workers highly exposed to both physical and psychosocial work risk factors
compared to those exposed by a lesser extent to both sets of factors. A similar exposure-risk
relationship was observed for self-reported symptoms in the hands/wrists experienced both
within the last 7 days and the last 12 months at the time of the survey using the same exposure
criteria. The risk to the hands/wrists due to high exposure to physical and psychosocial work
risk factors was approximately seven times greater than being exposed to these risk factors by
a lesser extent (OR 6.94 95%CI 3.79–12.82). An exposure-risk relationship was not observed
for the same definition in the neck region. After controlling for the effects of age, gender and
cumulative exposure, a similar exposure-risk relationship was observed for each exposure
group and recurrent back disorders except for the low physical-high psychosocial exposure
group. The risk associated with this group was equal to unity.

Discussion
Exposure to a combination of psychosocial work risk factors seems to have a greater impact
on the level of risk compared to considering individual psychosocial work factors (Bongers
and Houtman, 1995). The greatest risk of experiencing musculoskeletal disorders derived from high exposure to both physical and psychosocial work risk factors, but physical work risk factors were more important determinants than psychosocial work risk factors for recurrent back and hand/wrist disorders. A Swedish epidemiological study also showed that a combination of heavy lifting and a poor psychosocial work environment increased the risk of back pain and neck pain compared with exposure to neither work factor (Linton, 1990).
However, the study design did not permit the analysis of the other possible exposure
combinations. In this study, associations with neck problems were not observed. Workers in
the low exposed groups performed tasks that were associated with neck and shoulder
disorders so they were not truly unexposed with respect to this anatomical region. As a result,
it could not be determined whether the combination of high exposure to physical and
psychosocial work risk factors increased the risk of neck disorders.
A cross-sectional study does not allow exposures to be measured before the onset of
musculoskeletal problems and so reporting biases may have been present for exposures and
self-reported symptoms. The influence of these biases was controlled by assessing current
exposures and assessing recently experienced symptoms. Self-reported exposures were also
tested using observation, instrumentation and interview methods and it was found that workers
could provide accurate reports for the exposure criteria (Devereux, 1997a). The cross-sectional study was also limited in its ability to examine exposure-disease causation, but temporal data on the outcome
measure provided strong evidence that the exposures currently experienced were associated
with the development of back problems. The combination risk effects were not limited to the
back and have also been shown to be present for the hands/wrists.

Conclusions
The relationship between physical and psychosocial work risk factors is complex and is not
fully understood but reduction in exposure to both sets of factors is needed in risk prevention
strategies for musculoskeletal disorders. Ergonomic interventions should be targeted at the
organisation of work and the individual worker to reduce the psychosocial work stressors and
also the physical stressors. The consumption of goods and services remains unabated and will
be driven to higher levels in the years to come. Work organisations should strive to achieve a
balance between satisfying the demand and maintaining a healthier workforce.

References
Bongers, P.M., de Winter, C.R., Kompier, M.A.J., & Hildebrandt, V.H. 1993, Psychosocial
factors at work and musculoskeletal disease, Scandinavian Journal of Work
Environment and Health, 19, 297–312
Bongers, P.M. and Houtman, I.L.D. 1995, Psychosocial aspects of musculoskeletal
disorders. Book of Abstracts, Proceedings of the Prevention of Musculoskeletal
Disorders Conference (PREMUS 95), 24–28 September, Montreal, Canada, (IRSST,
Canada), 25–29
Burton, A.K. 1997, Spine update—Back injury and work loss: Biomechanical and
psychosocial influences, Spine, 22, 2575–2580
Devereux J.J. 1997a, A study of interactions between work risk factors and work related
musculoskeletal disorders, Ph.D. Thesis. University of Surrey
Devereux, J.J. 1997b, Back disorders and manual handling work—The meaning of
causation, The Column, 9, 14–15
Evans, G.W., Johansson, G., & Carrere, S. 1994, Psychosocial factors and the physical
environment: Inter-relations in the workplace. In C.L.Cooper & I.T.Robertson (eds.),
International review of industrial and organizational psychology, (John Wiley,
Chichester), 1–30
Hildebrandt, V.H., Bongers, P.M., Dul, J., Van Dijk, F.J.H., & Kemper, H.C.G. 1998,
Validity of self-reported musculoskeletal symptoms, Occupational and
Environmental Medicine, In press
Hurrell, J. & McLaney, M. 1988, Exposure to job stress-a new psychometric instrument,
Scandinavian Journal of Work Environment and Health, 14 Supplement 1, 27–28
Leboeuf-Yde, C., Lauritsen, J.M., & Lauritzen, T. 1997, Why has the search for causes of
low back pain largely been nonconclusive?, Spine, 22, 877–881
Linton, S.J. 1990, Risk factors for neck and back pain in a working population in Sweden,
Work and Stress, 4, 41–49
Nachemson, A.L. & Andersson, G.B.J. 1982, Classification of low-back pain, Scandinavian
Journal of Work Environment and Health, 8, 134–136
Pheasant, S. 1991, Ergonomics, Work and Health (Macmillan Press, London)
Sauter, S.L. & Swanson, N.G. (1996). An ecological model of musculoskeletal disorders in
office work. In S.D.Moon & S.L.Sauter (eds.), Beyond Biomechanics: Psychosocial
Aspects of Musculoskeletal Disorders in Office Work (Taylor and Francis, London),
1–22
Wiktorin, C., Karlqvist, L., & Winkel, J. 1993, Validity of self-reported exposures to work
postures and manual materials handling, Scandinavian Journal of Work Environment and Health, 19, 208–214
STEPHEN PHEASANT MEMORIAL SYMPOSIUM:
THE ROLE OF PSYCHOSOCIAL FACTORS

A Kim Burton

Spinal Research Unit, University of Huddersfield,


c/o 30 Queen Street, Huddersfield HD1 2SP

Low back trouble affects a majority of workers at some time in their lives; many recover, but some become significantly disabled. The notion of achieving primary control through ergonomic intervention, based on biomechanics principles, whilst intuitively attractive, has so far been unhelpful. Biomechanics/ergonomic considerations can sometimes explain the first recalled onset of low back pain, but there is little evidence that secondary control based solely on these principles will influence the risk of progression to chronic disability. More promising are intervention programs that take account of the psychosocial influences surrounding disability. Ergonomics can assist by ensuring that workplaces are comfortable and accommodating, for both fit and back-troubled workers.

Introduction
That low back trouble (LBT) is an increasing problem in industrialised society is axiomatic,
despite the efforts of ergonomists, clinicians and legislators. There is a dichotomy:
ergonomists and biomechanists strive to reduce physical stress at the workplace with the
intent of lowering the risk of musculoskeletal problems, yet clinicians and psychologists are
suggesting that rehabilitation of the back-injured worker should involve not only activity, but
physical challenges to the musculoskeletal system.

Background
Epidemiology may point to a link between physically hard work or whole body vibration and
back pain, but this link is not universal; seemingly much depends on definitions for back pain
and workload. Similarly, reports of an association between heavy work and absenteeism are
not entirely consistent. However, experimental biomechanical evidence does suggest that
strenuous work is likely to be detrimental; in vitro experiments, simulating physiological
occupational loads, can result in fatigue damage to numerous spinal tissues (Adams and
Dolan, 1995; Brinckmann et al. 1988). Thus, a reduction of occupational loads should limit
work related back trouble, but despite the gradual reduction in occupational physical stressors
back pain has not decreased; in fact, disability due to LBT has increased.

Back pain can be as prevalent among sedentary workers as among manual workers but heavy
jobs do seem to be associated with an increased work loss. Work-related LBT should be viewed
against the high background level of reporting of a symptom which has an undetermined
pathology, a propensity for recurrence and a variable tendency to progress to disability. The
identification of risk factors is problematic, and it is difficult to be certain that a particular job is
involved in causation. A consequence of the symptoms is often an inability (or reluctance) to
perform activities of daily living as well as work activities. The following discussion explores
some interrelationships between damage, symptom reporting and disability.

Spinal damage
The intervertebral disc is presumed at greatest risk of damage from physical stress. Disc
degeneration is influenced only modestly by work history; the greatest proportion of the explained
variation in degeneration can be accounted for by genetic influences, though age does have
some influence (Battié et al. 1995). When matching for age, sex, and work-related factors, disc
herniations were found in 76% of an asymptomatic control group compared with 96% in the
symptomatic group—the presence of symptoms was related to neural compromise and
psychosocial aspects of work, but not to the exposure to physical stressors (Boos et al. 1995). A
new method for quantifying overload damage from radiographs has enabled comparison of
cohorts exposed to heavy work with those exposed to light work; irreparable damage was
associated only with jobs entailing excesses of loading or vibration, suggesting that current
regulations are adequate protection against overload damage (Brinckmann et al. 1998).

Other structures may also sustain damage. Deficient intrinsic spine muscles or a lack of
motor control may increase the risk of straining muscles or ligaments (Cholewicki and
McGill, 1996), but recovery should be fairly rapid.

Irrespective of whether damage to spinal structures can be identified or quantified, there is no doubt that workers do get painful backs, and some will believe it is their work which is to blame. A study of sick-listed blue-collar workers found that 60% of patients believed that
work demands had caused their back trouble, but neither an assessment of workload (e.g.
lifting, bending) nor calculated compression loads predicted the rate of return to work or sick-
leave during follow-up (Lindstrom et al. 1994).

Injury, recurrence and work loss


Some data suggest that the risk of LBT is associated with the dynamics of lifting, and one
study has linked epidemiological findings with quantitative biomechanical findings, though
causation was not established (Marras et al. 1993). Experienced industrial workers seemingly
have a reduced risk for LBT compared with inexperienced workers, but this may be related more to muscular coordination aiding spinal stability than to lowered spinal loads (Granata et al. 1996).

Workers in similarly demanding occupations can have varying symptomatology. A study of nurses in Belgium and The Netherlands has shown a significantly lower prevalence of back trouble (and other musculoskeletal complaints) in the Dutch nurses, despite the fact that their average workload was substantially greater than that of their Belgian counterparts. Overall, symptoms and work loss were not related to workload. The Dutch nurses differed strikingly on a range of psychosocial variables; they were less depressed and significantly more positive about pain, work and activity (Burton et al. 1997).

A large general population study (Croft et al. 1995) has found that new episodes of LBT are
more likely for those who are psychologically distressed, even for first onsets. An industrial
study (Bigos et al. 1991) found that reported first injuries were not related specifically to job
demands, but rather to psychosocial factors such as low job satisfaction. Police officers in Northern Ireland have proved useful for studying first-onset LBT; they compulsorily wear body armour weighing >8 kg. Compared with an English police force without body armour, they showed reduced survival time to first onset. It was also found that working in vehicles comprised a separate risk, but the effect of exposure to armour and vehicles was not additive.
The proportion of officers with persistent (chronic) back complaints did not depend on the length of exposure since first onset; rather, chronicity was associated with psychosocial
factors (distress and blaming work) (Burton et al. 1996). There is little support for a
relationship between recurrence and work demands. The best predictor of future trouble
seems to be a previous history, with perception of work demands being more important than
objective measurement (Troup et al. 1987), and dissatisfaction with work being a significant
factor. The term re-injury may be a misnomer (Bigos et al. 1991).

Workers with current LBT have been shown to have a lower score for job satisfaction and
social support but, surprisingly, absenteeism and work heaviness were not related to these
parameters (Symonds et al. 1996; Burton et al. 1997). But other attitudes and beliefs do seem
to be relevant. Psychosocial factors such as negative beliefs about the inevitable
consequences of LBT, inadequate pain control strategies, fear-avoidance beliefs and belief
that work was causative have all been found to relate to absenteeism. The relationship
between attribution of cause, job satisfaction and pain perception is complex, but a simple
educational intervention program (comprising workplace broadcasting of a pamphlet
stressing the benign nature of LBT, the importance of activity and desirability of early work
return) is capable of creating a positive shift in beliefs with a concomitant reduction in
extended absence (Symonds et al. 1995). There is accumulating evidence that early return to
the same task is beneficial and does not heighten the risk of recurrence of symptoms (or do
further damage). A three-year follow-up of occupational musculoskeletal injuries (including
LBT) found that those whose workloads had been reduced did not report fewer problems
(Kemmlert et al. 1993). In fact, a successful rehabilitation program for patients with
subchronic back pain has advocated early return to unrestricted duties as part of a combined
graded activity/behavioural therapy approach (Lindstrom et al. 1992). Clinical studies in
workers’ compensation back pain patients have found that delayed functional recovery was
associated with psychosocial factors more than with perceived task demand (Hadler et al.
1995), and that longer spells off work were associated with a poor outcome (Lancourt and
Kettelhut, 1992).

The reluctance to confront normal physical challenges seen in back-disabled workers has been termed activity intolerance, which is variously linked to individual response to pain, the belief that a specific injury must be the cause of the pain, and behavioural roles such as suffering. The question obviously arises as to the origin of the various relevant psychosocial traits. There is clinical evidence that psychological profiles predictive of chronicity are present very early in the course of the back pain experience (Burton et al. 1995); seemingly they are not a result of prolonged pain.

Effectiveness of ergonomic intervention


Supportive evidence for the belief that ergonomic intervention will reduce the impact of
occupational low back pain is not compelling. The only intervention which has been formally
evaluated is worker training in manual handling techniques; whilst lifting techniques can be
improved, the effect on injury rates has not been clearly demonstrated (Smedley and Coggon,
1994). A recent rigorous evaluation of a ‘back school’ approach to injury prevention found
that the programme did not reduce the rate of injury, time off work, or rate of reinjury, even
though the subjects’ knowledge of safe behaviour was increased (Daltroy et al. 1997).

Summary
On balance, there is evidence to support the notion that biomechanics-based ergonomic
improvements to the workplace have some potential to limit first-time back injury; therefore
they should be deployed where practicable. The possible role of ergonomics for reducing
recurrence rates seems limited at best; conversely there is no convincing evidence that
continuance of work is detrimental in respect of disability. A proportion of workers with back
pain, having inappropriate beliefs about the nature of their problem and its relationship to
work, will develop fear-avoidance behaviours because of inadequate pain coping strategies;
they then begin to function in a disadvantageous way and drift into chronic disability. This
issue may best be challenged by a combination of organisational and psychosocial
interventions intended to make the workplace comfortable and accommodating (Hadler, 1997).

References
Adams, M.A. and Dolan, P. (1995) Recent advances in lumbar spinal mechanics and their
clinical significance. Clin Biomech 10, 3–19.
Battié, M.C., Videman, T., Gibbons, L., Fisher, L., Manninen, H. and Gill, K. (1995)
Determinants of lumbar disc degeneration: a study relating lifetime exposures and
MRI findings in identical twins. Spine 20, 2601–2612.
Bigos, S.J., Battié, M.C., Spengler, D.M., Fisher, L.D., Fordyce, W.E., Hansson, T.,
Nachemson, A.L. and Wortley, M.D. (1991) A prospective study of work perceptions
and psychosocial factors affecting the report of back injury . Spine 16, 1–6.
Boos, N., Reider, V., Schade, K., Spratt, N., Semmer, M. and Aebi, M. (1995) The
diagnostic accuracy of magnetic resonance imaging, work perception, and
psychosocial factors in identifying symptomatic disc herniations. Spine 20, 2613–
2625.
Brinckmann, P., Biggemann, M. and Hilweg, D. (1988) Fatigue fracture of human lumbar
vertebrae. Clin Biomech 3 (Suppl. 1), s1-s23.

Brinckmann, P., Frobin, W., Biggeman, M., Tillotson, M. and Burton, K. (1998)
Quantification of overload injuries to thoracolumbar vertebrae and discs in persons
exposed to heavy physical exertions or vibration at the work-place. Part II. Occurrence
and magnitude of overload injury in exposed cohorts. Clin Biomech 13
(Supplement), (in press)
Burton, A.K., Tillotson, K.M., Main, C.J. and Hollis, S. (1995) Psychosocial predictors of
outcome in acute and subchronic low back trouble. Spine 20, 722–728.
Burton, A.K., Tillotson, K.M., Symonds, T.L., Burke, C. and Mathewson, T. (1996)
Occupational risk factors for the first-onset of low back trouble: a study of serving
police officers. Spine 21, 2612–2620.
Burton, A.K., Symonds, T.L., Zinzen, E., Tillotson, K.M., Caboor, D., Van Roy, P. and
Clarys, J.P. (1997) Is ergonomics intervention alone sufficient to limit
musculoskeletal problems in nurses? Occup Med 47, 25–32.
Cholewicki, J. and McGill, S.M. (1996) Mechanical stability of the in vivo lumbar spine:
implications for injury and chronic low back pain. Clin Biomech 11, 1–15.
Croft, P.R., Papageorgiou, A.C., Ferry, S., Thomas, E., Jayson, M.I.V. and Silman, A.J.
(1995) Psychologic distress and low back pain: evidence from a prospective study in
the general population. Spine 20, 2731–2737.
Daltroy, L.H., Iversen, M.D., Larson, M.G., Lew, R., Wright, E., Ryan, J., Zwerling, C.,
Fossel, A.H. and Liang, M.H. (1997) A controlled trial of an educational program to
prevent low back injuries. New England Journal of Medicine 337, 322–328.
Granata, K.P., Marras, W.S. and Kirking, B. (1996) Influence of experience on lifting
kinematics and spinal loading. In: Proceedings of the 20th Annual Meeting of the American Society of Biomechanics, Georgia Tech, Atlanta, USA.
Hadler, N.M., Carey, T.S. and Garrett, J. (1995) The influence of indemnification by workers'
compensation insurance on recovery from acute backache. Spine 20, 2710–2715.
Hadler, N.M. (1997) Workers with disabling back pain. New Eng J of Med 337, 341–343.
Kemmlert, K., Orelium-Dallner, M., Kilbom, A. and Gamberale, F. (1993) A three-year
followup of 195 reported occupational over-exertion injuries. Scand J Rehabil Med
25, 16–24.
Lancourt, J. and Kettelhut, M. (1992) Predicting return to work for lower back pain patients
receiving workers compensation. Spine 17, 629–640.
Lindstrom, I., Ohlund, C., Eek, C., Wallin, L., Peterson, L. and Nachemson, A. (1992)
Mobility strength and fitness after a graded activity program for patients with
subacute low back pain: A randomized prospective clinical study with a behavioral
therapy approach . Spine 17, 641–652.
Lindstrom, I., Ohlund, C. and Nachemson, A. (1994) Validity of patient reporting and
predictive value of industrial physical work demands. Spine 19, 888–893.
Marras, W.S., Lavender, S.A., Leurgans, S.E., Rajulu, S.L., Allread, W.G., Fathallah, F.A.
and Ferguson, S.A. (1993) The role of dynamic three-dimensional trunk motion in
occupationally-related low back disorders: The effects of workplace factors trunk
position and trunk motion characteristics on risk of injury. Spine 18, 617–628.
Smedley, J. and Coggon, D. (1994) Will the manual handling regulations reduce the
incidence of back disorders? Occup Med 44, 63–65.
Symonds, T.L., Burton, A.K., Tillotson, K.M. and Main, C.J. (1995) Absence resulting from
low back trouble can be reduced by psychosocial intervention at the work place.
Spine 20, 2738–2745.
Symonds, T.L., Burton, A.K., Tillotson, K.M. and Main, C.J. (1996) Do attitudes and
beliefs influence work loss due to low back trouble? Occup Med 46, 25–32.
Troup, J.D.G., Foreman, T.K., Baxter, C.E. and Brown, D. (1987) The perception of back
pain and the role of psychophysical tests of lifting capacity. Spine 12, 645–657.
MUSCULOSKELETAL
DISORDERS
INTERPRETING THE EXTENT OF
MUSCULOSKELETAL COMPLAINTS

Claire Dickinson

HSE, Magdalen House, Trinity Road,


Bootle, L20 3QZ.

A number of studies have compared the extent of musculoskeletal complaints in adult working populations using the Nordic musculoskeletal questionnaire. The design of such studies has typically involved cross-sectional cohorts of occupational groups being used as referent populations for each other. The self-reported complaints of aches, pain or discomfort relating to a particular body area lead to a percentage being determined for those reporting positively. The difficulty lies in then deciding when a particular percentage indicates that there is reason for concern and action is needed. The current paper proposes an interpretation system based on the annual prevalence, discusses its limitations and offers suggestions on its future development.

Introduction
The Nordic Musculoskeletal Questionnaire is a valuable tool enabling large scale
surveys into the extent of self-reported musculoskeletal complaints (Kuorinka et al,
1987). It has been extensively cited in the technical literature describing the state of
occupational populations (e.g. David and Buckle, 1997; Williams and Dickinson,
1997). Following a series of publications of referent data (Ydreberg and Kraftling,
1988), HSE embarked on an evaluation of the questionnaire and produced both
long and abridged standardised versions for their own use (Dickinson et al, 1992).
Whether using the original, HSE's or a self-devised questionnaire, a similar form of question has been used to enquire about the extent of troubles such as aches, pains, or discomfort in the last year (Annual Prevalence), the last week (Weekly
Prevalence) or that has prevented activity in the last year (Annual Disability). Users
have then produced a series of tables showing the percentage of that occupational
group reporting positive complaints for nine, defined, body areas. Many
researchers have designed their cross-sectional studies such that the data for the
occupational group of interest is compared to one or more control or referent
populations. This enables the findings to be interpreted in a wider context. In the
absence of an appropriate control group, this approach produces little more than a statement that population X reports more or fewer complaints in a particular body area than populations A, B or C. However, what is usually needed at
this point is a form of interpretation and conclusions on the extent of the problem
among this occupational group, the part of the body which is particularly affected
and, if this is the case, what to do about it. At the current time, employers and
researchers seem to lack a basis for deciding on further action, apart from using
their personal judgement to establish a threshold (X%) for action.

Action Levels
With this in mind, annual prevalence figures from a number of HSE studies of self-
reported musculoskeletal complaints were reviewed to see if a simple interpretation
system could be devised. This could also be described as an attempt at defining
where priority should be allocated given limited resources. The data covered
workers operating 64 different systems of work. The occupations included trades or
different operating systems in the cotton, ceramic, food processing, construction
and garment manufacturing sectors as well as production line assembly, packing
and supermarket cashiers. The data from each occupational group were considered
separately but in total included 1781 males and 4704 females. Given the size of
some populations that were studied, it was not feasible to group within an
occupation by separate age groups or any indices such as the length of time in a
given job.

Table 1 shows three action levels based on the annual prevalence data. The median values plus an arbitrary 10% form the row described as "high". The median values less 10% form the "medium" row. The "low" row covers the percentages which fall below the medium level.

Table 1. % Annual Prevalence Action Levels

Key: N—Neck, Sh—Right and left shoulders, E—Both elbows, WH—Wrist and
hands, UB—Upper Back, LB—Lower Back, H—Hips, K—Knees, A—Ankles.

It is possible to compare newly collected data with the values shown in Table 1
and focus on those body areas where high levels are located in order to establish if
there is a problem and identify the body area affected. For example, if a population
study established that 50% of the females were self-reporting neck complaints in
the last year, then serious attention might be given to establishing why this may be. For the same situation, a value of 41% would indicate that action was still merited but with lesser urgency. Thirdly, if a "low" level was established, further action should not be overlooked if accessible and straightforward, but may be regarded as of lesser priority until 'high' and 'medium' situations had been tackled.

The 10% criterion is purely arbitrary. Alternatives might be suggested which reflect
the range of responses. However, when applying the current system to the original
64 systems of work, about 7 emerge in the 'high' action level, and they do seem to be the ones where, subjectively, improvements are thought to be particularly merited.
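As a purely illustrative sketch of the arithmetic behind Table 1 (using made-up prevalence figures, not the HSE data), the snippet below derives 'high' and 'medium' action levels from the median annual prevalence across a set of reference occupational groups and classifies a newly observed percentage.

```python
from statistics import median

def action_levels(prevalences, margin=10.0):
    """Derive 'high' and 'medium' action levels for one body area.

    prevalences: annual prevalence (%) for that body area across the
    occupational groups used as the reference set (hypothetical values,
    not the HSE data). margin is the arbitrary 10% criterion.
    """
    med = median(prevalences)
    return {"high": med + margin, "medium": med - margin}

def classify(observed, levels):
    """Place a newly observed annual prevalence (%) into an action level."""
    if observed >= levels["high"]:
        return "high"
    if observed >= levels["medium"]:
        return "medium"
    return "low"

# Hypothetical neck-complaint prevalences (%) for a set of reference groups
neck = [28, 35, 31, 42, 38, 33, 29, 40]
levels = action_levels(neck)
print(levels)                 # -> {'high': 44.0, 'medium': 24.0}
print(classify(50, levels))   # 'high'   -> serious attention merited
print(classify(41, levels))   # 'medium' -> action merited, lesser urgency
```

The classification mirrors the worked example in the text: a 50% annual prevalence falls in the 'high' band, whereas 41% falls in the 'medium' band and so merits action with lesser urgency.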

Alternatives
The HSE’s work on supermarket cashiers has extended this approach slightly
further (Mackay et al, 1998). Tables in Mackay et al (1998) are presented showing
benchmarks for the annual prevalence, weekly prevalence and annual disability
based on the variation found amongst check-out operators operating different
systems of work. Two sets of figures are shown, depending on the size of the
population surveyed. It is anticipated that they will serve to assist retail managers in deciding whether their check-out operations are of an acceptable standard overall and to prioritise areas for remedial action.

Wider Considerations
The advantages of the proposed interpretation system are many. The rational basis
for action reduces unnecessary costs and assists in prioritising resources. It also
provides an opportunity to move forward with a proportional response. There are
however some limitations to the approach shown in Table 1. In particular, there was
no selection of specific occupational groups in an attempt to represent UK industry.
Their inclusion was simply based on the availability of data. Secondly, there is
the lack of age or exposure sensitivity. However, while these action levels
are tentatively suggested given the absence of other criteria, there is a clear recognition that improved criteria could be derived in the future and, indeed, this is desirable.

Self-reported questionnaires do not elicit what has caused the onset of complaints,
nor do they attribute causation entirely to working activity. Musculoskeletal
disorders are by their nature associated with a multi-factorial aetiology with factors
involved in their onset grouped as personal, clinical, organisational or concerned
with aspects of the work-space. Hence, a sensible, measured response to “high”
levels found in the workplace might include an ergonomic hazard spotting exercise
with further measurements to pin-point where the risk lies. This may take many
objective and subjective forms, such as the assessment of vibration levels,
psychosocial assessment, measurements of workload or equipment or activity
levels.

Recording the organisational system for dealing with reports of musculoskeletal aches and pains in a workplace, and the prevailing climate, is often overlooked in
studies. Boocock et al (1997) have reported however that these factors have a strong
influence on the extent of reporting. In a situation where there are good employer-
employee relations, widescale positive self-reports of musculoskeletal complaints
may occur with the expectation that an active response will follow—irrespective of
the level of the observable risk. The opposite situation is more likely to be found. In
a climate of job uncertainty it may be perceived by respondents that a positive
report on a questionnaire may lead to a threat to their livelihood. Hence under-
reporting ensues.

Another confounder to be overcome is the high variability in the severity of the symptoms reported by respondents. A complementary clinical examination to
determine grades of severity or a diagnosis is potentially useful here. Alternatively,
the use of an extended questionnaire which includes questions on severity, defined
as the effect on the person or the extent of disability (described as the prevention of
doing activities) would seem to be important in defining which cases are
experiencing the more costly forms of musculoskeletal disorder. Such questioning
might include:
Severity: whether a doctor or similar has been seen within a given period, or the frequency or duration of symptoms.
Disability: days or number of occasions absent from work, or any reduction in home or working activities.

Given such limitations in deriving a system for prioritising action it may be prudent
to extend any future system to incorporate severity and disability indices at the very
least.

Conclusion
This paper has presented one approach that can be used to interpret annual prevalence data and to prioritise remedial action. Whilst an
interpretation system based on annual prevalence has its uses, a system that
encompasses prevalence, severity and disability measures would seem to be more
acceptable in the long term. Supplementing the use of questionnaire surveys with
an ergonomic hazard spotting or comprehensive risk assessment provides an
opportunity to establish the extent of reporting and the nature of the improvements
required to manage the problem of concern.

References
Boocock, M., 1997, Personal Communication re: Relative Risks Project
David, G. and Buckle, P., 1997, A questionnaire survey of the ergonomic
problems associated with pipettes and their usage with specific reference to
work-related upper limb disorders, Applied Ergonomics, 28, 4, 257–262.
Dickinson, C.E., Campion, K., Foster, A.F., Newman, S.J., O'Rourke, A.M.T. and
Thomas, P. 1992, Questionnaire development: an examination of the Nordic
Musculoskeletal Questionnaire, Applied Ergonomics, 23, 3, 197–201.
Kuorinka, I., Jonsson, B., Kilbom, A., Vinterberg, H., Biering-Sorensen, F.,
Anderson, G. and Jorgensen, K., 1987, Standardized Nordic Questionnaires
for the analysis of musculoskeletal symptoms, Applied Ergonomics, 18, 3,
233–237.
Mackay, C., Burton, K., Boocock, M., Tillotson, M., Dickinson, C.E., 1998,
Musculoskeletal Disorders In Supermarket Cashiers. HSE Research Report.
HSE Books.
Williams, N. and Dickinson, C.E., 1997, Musculoskeletal complaints in lock
assemblers, testers and inspectors, Occupational Medicine, 47, 8, 479–484.
Ydreberg, B and Kraftling, A., 1988, Referensdata Till Formularen FHV 001 D,
FHV 002 D, FHV 003 D, FHV 004 D och FHV 007 D. Rapport 6. The
Foundation for Occupational Health Research and Development. Orebro.

The views expressed are those of the author and not necessarily those of the
HSE.
PEOPLE IN PAIN

Donald Anderson

Centre for Occupational and Environmental Medicine,


The National Hospital, Pilestredet 32, N-0027 Oslo, Norway.

Many people at work suffer from some degree of musculoskeletal illness, but
these make up only a subset of the general population, of whom up to 86%
may complain about pain from this source. A survey of occupational health
centres was carried out in a health district in Norway to determine the occurrence of musculoskeletal problems in the working population. A
questionnaire was sent out including questions about the number of patients
seen per week, the proportion diagnosed with a musculoskeletal problem,
how the diagnosis was made, and what proportion benefitted from treatment.
Other questions were asked about the type of work and possible causes for
the problem. The possibilities are mooted for general ergonomics education and training as a supplement to more conventional intervention.

Introduction
It is axiomatic that many people at work suffer from some degree of pain caused by
musculoskeletal illness, an affliction that seems now to be pandemic, and numerous authors have reported specific studies of selected parts of the working population. A few examples of such studies include office workers (Grandjean, 1988), paediatric surgeons (Cowdery and Graves, 1997), bus drivers (Kompier et al, 1987), and garage mechanics (Torp, 1997).
These occupational groups, however, represent only subsets of the population at large and
more researchers are now reporting surveys of the general population. Natvig, et al, (1995)
found that 86% of respondents to a general population survey in Norway had pain of musculoskeletal origin in the previous year. Hagen, et al, (1997), in another Norwegian survey of 20,000 people (59% response), concluded that up to 61% had experienced musculoskeletal pain in the
previous month. Other studies have taken place in Sweden and Denmark with similar results.
Table 1 shows some of these data.
In some of these studies the results are based on self-reports of pain in the musculoskeletal
system, often using the Nordic questionnaire (Kuorinka, et al, 1987), or equivalent. Without
confirmation of diagnosis by clinical follow-up this may introduce some level of over-reporting, but it nevertheless gives an indication of the scale of the problem. Of the two
general population studies, Natvig, et al, (1995) relied only on the self reports, but Hagen, et
al, (1997) arranged clinical diagnosis of some 160 cases to identify inflammatory rheumatoid
arthritis as a separate group (ca. 7%).
Whatever the diagnosis, specific or not, and regardless of whether the condition is work-
related or not, a major problem of pain appears to exist, which is likely to have a major impact on the quality of life of sufferers at home and at work.

Table 1. Percentage prevalence of reported ms. pain

1: Grandjean (1988); 2: Cowdery and Graves (1997); 3: Kompier et al (1987); 4: Torp (1996); 5: Natvig et al.
(1995); 6: Hagen et al. (1997)

The Centre for Occupational and Environmental Medicine (SYM) at the National Hospital in Oslo was set up three years ago. Although it is a resource with responsibility for an entire health district in Norway, covering approximately 437,000 people at work, little information was available on what potential problems SYM might be expected to investigate within the field of ergonomics, or on the extent of the workload. To attempt to quantify potential
workload, and as a marketing exercise, therefore, SYM carried out a survey amongst 110
occupational health centres in the district.

Conduct of the survey


The aim of the survey was to determine by means of a questionnaire the extent of problems of
musculoskeletal illness in the working population, as identified by occupational health
professionals. The questionnaire, containing only ten questions (to encourage replies), was sent
out to 110 occupational health centres, ranging from district and community services to those
offered by large companies to their employees. A covering letter also explained the function of
SYM and the expertise available in the areas of medicine, occupational hygiene and ergonomics.
Questions asked were about the number of patients seen per week, the proportion diagnosed
with a musculoskeletal problem, how diagnosis was normally made (multiple choice), and what
proportion may have benefitted from treatment. Other questions were asked about the type of
work and possible causes for the problem, including workplace design (multiple choice) and
the possible influence of so-called psychosocial factors (multiple choice). There were obvious
limitations to the extent and reliability of the data obtained in this way, but it was intended
mainly to be of use in planning SYM’s strategy in relation to ergonomics.

Results
Conclusions have been drawn from an analysis of 55 replies (50% response). These included 20 industry-based services, covering paper and glass-making, biscuit manufacture, engineering,
forestry and wood products and the police. Fourteen of our sample were public or community
based and 21 others included private physicians and occupational health clinics.
A few replies included comments about the questionnaire. Some respondents thought that
musculoskeletal illness should have been clearly defined; others had difficulty with the expression
‘patient’ in relation to company employees, and ‘cases’ may have been a better word. Some
were not convinced that their opinions about possible cause were valid, and others claimed not
to have adequate or sustained records. A few physicians explained that as company medical
officers they may diagnose, but treatment and follow-up were sometimes the province of a
patient’s general practitioner. It transpired that the attendant occupational medical officers were
mostly full-time, but some were performing a part-time occupational service in addition to their
private practice, and some centres were staffed only by physiotherapists. Some company-related occupational health services were being phased out and 'outsourced'.

Cases seen (patients) and diagnosed with musculo-skeletal problems


Most respondents saw fewer than 25 cases per week, and some saw between 25 and 35, whilst a
few saw as many as 45 or more per week. Of these, some diagnosed more than 50% of cases,
nearly half diagnosed 25–50% and a few diagnosed less than 25% of their cases as suffering
from musculoskeletal illness. More than half of respondents apparently relied only on
reported pain for their diagnosis, although many reported using joint mobility and/or other
unspecified tests to confirm diagnosis.

Improvement after treatment


Half of the respondents saw improvement in 25–50% of their cases after treatment, some saw improvement in less than 25%, and a few in more than 50% of their cases.

Work-related factors/non-work factors


About half the respondents considered heavy physical work to be the prime cause of the complaint, and only slightly fewer felt that heavy mental work was just as important. Light mental work was considered more likely than light physical work to be a contributor. Nearly half of the respondents considered football or handball to be a major contributor, with nearly as many citing home decorating and carpentry. Some thought that home personal computer usage (perhaps Internet use) was a likely cause.

Workplace design
Office and VDU work and seating accounted for the majority of the workplaces cited, along with construction machinery and benchwork (assembly/fitting). Some respondents were themselves responsible for assessing workplace design factors, while others relied on physiotherapists or similar professionals to carry out this work.

Psycho-social factors and influence


Nearly three quarters of the respondents considered that problems of employment contributed to patients’ illness; nearly half saw problems with marriage and children as a contributory factor, but only a few considered economic problems important. More than a quarter saw the combination of family problems and problems at work as important, and a few saw all of these factors as important in combination. Fewer than 10% of respondents thought that such causes were
influential in more than 50% of their cases, but more than half considered them to be so for
between 25 and 50% of cases. A few saw such factors as influential in less than 25% of cases.

Discussion and conclusions


The data collected lacks strict statistical validity, but a number of services sampled across a cross-section of industry gave considered replies. The data confirms that a great many people are diagnosed as suffering from musculoskeletal problems. The number of cases seen each week suggests a scale of problem beyond the capacity of available resources to follow up adequately and to identify reliably any occupational or leisure basis for the complaints. To achieve an accurate diagnosis, clinical examination of the patient and follow-up inspection and assessment of the job and workplace seem desirable in all cases, which would impose a very heavy load on the occupational physician or other specialist colleagues.
Although some reported assessment of workplaces takes place, it is not clear just how much
intervention is being recommended, but between 25 and 50% of cases were successfully treated.
However, the study did not explore long term benefits, or the percentage of re-affliction or
chronic suffering. A wide variety of workplaces and tasks were implicated as possible causal
factors, with office and data work being predominant, contributing to heavy mental workload as
a cause of illness. Operating construction and other plant, and mechanical workshop activity
provided the basis for much of the heavy physical work reported as a cause of complaint. Non-
work activity included sport, home decorating and carpentry.
The impression was confirmed that so-called psycho-social factors are having an impact on musculoskeletal problems, and that high among these are problems at work and the fear of unemployment, as well as difficulties with marriage and children. More surprisingly, economic and housing problems seemed less important, although the influence of a combination of all these factors was recognised by many of the respondents.
The results from this study reinforce the evidence from many surveys and detailed studies that a major problem exists, and more than a suspicion that resources to combat the problem are
inadequate. Faced with such evidence for ‘demand’ for ergonomics intervention, what can be
done? Do-it-yourself seems to be one alternative, armed with published tools like RULA from
McAtamney and Corlett (1992). This systematic workplace assessment method, coupled with
some basic rules for design/re-design will help to reduce the risks of contracting work-related
upper limb disorders. Other guidelines are also available. These intervention methods, however,
require dedicated application and time, but are effective for redesign of working situations,
where ergonomics may not have been involved from the start of the design (product or production).
Methods to ameliorate symptoms ‘on-line’ may also be effective, such as pausgymnastik (‘pause gymnastics’), adopted from Sweden and recommended by Pheasant (1991) and others. Studies still in progress in Sweden and Norway also appear to show the benefits of exercise as both preventive and remedial in the treatment of musculoskeletal problems. In the United Kingdom, the
recently formed Body Action Campaign (1997) is actively involved in remedial and
preventive work amongst schoolchildren, and this initiative suggests the possibility for more
widespread education and training.

What may also be called ‘the Lothian initiative’ was launched in Edinburgh by Andrews and Kornas (1982). These authors produced a programme of ‘Ergonomics Fundamentals for
Senior Pupils’, including a workbook for teachers, intended to educate secondary school
students in basic ergonomics. Although the experiment was abortive at that time, evidence
may now be emerging that the programme could be effective in giving young people an
awareness of ergonomics, to help them to assess products and environments and react to poor
conditions in their working life. Hopefully, this will lead to a point where they can influence
their own working conditions.
Another programme, reported by Albers et al. (1997), teaches apprentice carpenters in the construction industry an awareness of ergonomics, with promising results. Over half of the apprentices completing the course reported using the information they received, and about the same number said that they had changed the way they worked following the course.
Other similar examples are beginning to appear in the literature, giving encouraging evidence for the notion that one way forward is through education, whereby, through example and diffusion, a slow process of general health improvement will result.

References
Albers, J.T., Li, Y., Lemasters, G., Sprague, S., Stinson, R. and Bhattacharya, A. 1997. An
Ergonomic Education and Evaluation Program for Apprentice Carpenters, Amer. J. Indust. Med., 32, 641–646.
Andrews, C.J.A., Kornas, B. 1982. Ergonomics Fundamentals for Senior Pupils. Napier
College (now Napier University), Edinburgh.
Cowdery, I.M. and Graves, R., 1997, Ergonomic issues arising from access to patients in
paediatric surgery. Contemporary Ergonomics 1997. (Taylor and Francis, London.)
Grandjean, E, 1988. Fitting the task to the man: a textbook of occupational ergonomics.
(Taylor and Francis, London.)
Hagen, K.B., Kvien, T.K., and Bjørndal, A. 1997. Musculo-skeletal pain and quality of life in patients with non-inflammatory joint pain compared to rheumatoid arthritis: A population survey. The Journal of Rheumatology, 24.
Kompier, M., de Vries, M., van Noord, F., Mulders, H., Meijman, T. and Broersen, J. 1987.
Physical Work Environment and Musculo-skeletal Disorders in the Busdrivers
Profession. Musculoskeletal Disorders at Work (ed. P.Buckle) (Taylor and Francis,
London.)
Kuorinka, I., Jonsson, B., Kilbom, A., Vinterberg, H., Biering-Sorensen, F., Andersen, G., and Jorgensen, K. 1987. Standardized Nordic Questionnaire for the analysis of musculoskeletal symptoms. Appl. Ergonomics, 18, 3, pp. 233–237.
Natvig, B., Nessiøy, I., Bruusgaard, D. and Rutle, O. 1995. Musculoskeletal symptoms in a local community. Euro. J. Gen. Practice, 1, March.
McAtamney, L. and Corlett, E.N. 1992. Reducing the Risks of Work Related Upper Limb
Disorders: A Guide and Method. The Institute for Occupational Ergonomics,
University of Nottingham, Nottingham, UK.
Pheasant, S. 1991. Ergonomics, Work and Health. Macmillan Press, London.
Torp, S., Riise, T., and Moen, B.E. 1996. Work-related musculoskeletal symptoms among
car mechanics: a descriptive study. Occ. Med. 46, 6, pp 407–413.
PREVENTION OF MUSCULOSKELETAL DISORDERS IN
THE WORKPLACE—A STRATEGY FOR UK RESEARCH

L Morris, R McCaig, M Gray, C Mackay, C Dickinson, T Shaw and N Watson

Health and Safety Executive, Magdalen House,
Trinity Road, Bootle, L20 3QZ.

Research plays an important role in the Health and Safety Executive’s (HSE) strategy for the prevention of musculoskeletal disorders, the leading
cause of occupational ill-health in the UK. Over forty musculoskeletal
research projects have been funded since the early 1980s and the findings
have assisted HSE in advising industry about the nature and extent of
musculoskeletal risks and appropriate control measures. HSE is currently
reviewing its musculoskeletal research portfolio with the aim of
mapping out research priorities for the next 5 to 10 years. This paper
describes the development of a musculoskeletal research strategy and
outlines its major themes including research on pathomechanisms, risk
factors, strategies for exposure assessment, health surveillance methods and
intervention studies.

Introduction
Musculoskeletal disorders (acute and chronic) are the leading cause of self-reported
occupational ill health in the UK with an annual prevalence now estimated at over 900,000
cases caused by work (Health and Safety Commission 1997). The reported conditions can be
grouped in four categories (McCaig 1996):-

• transient soft tissue pains related to poor work posture and task design
• discrete soft tissue lesions such as carpal tunnel syndrome
• chronic pain syndromes affecting the lower back and limbs
• chronic degenerative disorders such as osteoarthritis of the hip

The health consequences of these conditions range from transient aches and pains to chronic
problems which may lead to permanent disability. Acute injuries to the musculoskeletal
system, resulting from poor task design and overexertion, are also a cause for concern.
Handling injuries, for example, account for around a third of all over-3 day injuries reported
to HSE.

Research plays an important role in HSE’s strategy for the prevention of acute and chronic
injury to the musculoskeletal system. HSE has to be able to get accurate information about
risk factors and control measures in order to undertake its core activities successfully,
including publication of guidance, publicity campaigns and workplace visits by inspectors.
All this work is heavily dependent on scientific knowledge drawn from the literature and
from HSE’s own research programme. Since the early 1980s, HSE has funded over forty
research projects on musculoskeletal issues and this together with the findings from in-house
field studies and technical investigations has helped to shape the advice given to industry.

In the past, HSE’s musculoskeletal research strategy has been incremental, research
questions being generated by ongoing policy development and operational activities. While
this approach had largely met organisational requirements, a review of occupational health
policy identified a need for a more strategic look at the programme, examining the range of
topics covered by current and completed projects and identifying significant gaps in
knowledge. The primary objective was to map out and prioritise research themes for projects
to be commissioned by HSE over the next five to ten years. The strategy also aimed to
identify topics for collaborative research at both national and international levels with a view
to maximising the benefits gained from limited research resources.

It is interesting to note that, internationally, a more strategic approach to musculoskeletal research is being adopted as countries seek to ensure that programmes address agreed
national priorities. In the USA and Finland, for example, recent musculoskeletal research
programmes have focused on preventive measures and workplace interventions (Haartz and
Sweeney 1995, Viikari-Juntura 1995).

The Strategy Development Process


Recent publications have considered the process of research strategy development (Rantanen
1992, National Institute for Occupational Safety and Health (NIOSH) 1996). An important
element is consultation involving stakeholders (employer and employee organisations,
occupational health practitioners, research funding bodies etc.,) as well as experts and
researchers. The criteria for defining priority topics are driven largely by expert and
stakeholder opinion and experience shows that several iterations of the process may be
needed before consensus is achieved (NIOSH 1996).

In developing the HSE strategy, information on research needs was gathered from a wide
range of sources including:-

• a review of previous and current extramural research projects
• research workshops on specific topics (back pain, upper limb disorders (Harrington et al 1996), diagnostic criteria)
• an overview of the scientific literature to identify emerging research themes
• musculoskeletal research strategies published by other funding bodies or professional organisations, e.g. NIOSH (1996)

A strategy development group, with representatives from HSE’s technical, policy and
research interests was formed to structure the information and develop an initial draft. An
important aspect of the work was the development of a framework, which integrated research
needs related to all types of work-related musculoskeletal disorders as well as acute injuries
and manual handling accidents. This was linked to regulatory requirements for risk
assessment and control and was based on the ergonomics concept of the degree of match
between task demands and individual capabilities.

The research needs identified in the draft strategy were debated at a research seminar
attended by leading research workers, technical experts, occupational health practitioners and
representatives of employer and employee organisations. The seminar programme focused on
developing trends in musculoskeletal research with sessions on health outcomes,
psychosocial factors, exposure assessment and control. International developments in
research and standards were also considered. The views expressed in this forum are being
incorporated in a second draft of the strategy which will be subject to further rounds of
consultation.

An important requirement to be met in developing a research strategy is that it should be widely communicated to stakeholders and feedback sought (Rantanen 1992). The HSE
strategy will ultimately be published, therefore, in order to encourage wider discussion and to
assist funding organisations in the planning of research programmes.

Emerging Research Themes


The review of the extramural projects showed that almost half of the HSE sponsored projects
had addressed issues connected with the development of manual handling guidance,
reflecting research needs associated with the introduction of the Manual Handling Operations
Regulations 1992. Comparatively fewer studies had been undertaken on the musculoskeletal
risks associated with keyboard tasks or clinical aspects such as disease mechanisms and
diagnostic criteria. Significant gaps were identified in relation to risk factors for the
development of upper limb disorders and the interactions between them. Other research needs
indicated by the review, included studies of the mechanisms of cumulative musculoskeletal
injury and the development of design guidelines for workplaces and tasks. While some of
these issues had been addressed in the scientific literature, further applied research was
necessary to develop practical workplace guidance, an important objective of HSE research.

The draft strategy paper built on this review and the earlier research workshops (Harrington et
al 1996), identifying research needs related to risk factors, exposure measurement, health
outcomes, health surveillance, medical management and workplace interventions. These
areas encompass the wide spectrum of issues associated with the assessment and control of
musculoskeletal risks in the workplace, methodological issues associated with
epidemiological research being included alongside practical management concerns. Table 1
summarizes some of the main research issues identified to date. These are presented in order
to illustrate the general direction being taken by the strategy and are subject to further
consultation before any consideration is given to funding.

Table 1 HSE Musculoskeletal Research Strategy—Summary of Research Issues



Future Development
The development of any research strategy is a continuous process and the direction and
content of HSE sponsored musculoskeletal research is likely to change as research findings
are evaluated. The strategy is seen as a useful management tool for planning and
commissioning new research and has been designed with flexibility in mind. While the
strategy is based on current research trends, experience shows that some provision must be made to accommodate unforeseen scientific developments. Musculoskeletal research draws
on a number of parent disciplines and new directions can emerge from basic research.
Advances in research into pain mechanisms, for example, have led to new methodologies for
investigating chronic pain syndromes in the upper limbs.

It can be concluded that there is much to be gained from a strategic approach to the planning
of musculoskeletal research. The consultative process involving a wide range of stakeholders
will help to ensure that limited research resources are appropriately targeted and that research
findings are evaluated against agreed objectives. A published national strategy, which is
regularly updated, also provides an ideal vehicle for collaboration, enabling governmental,
academic and industrial research resources to be shared in the pursuit of common goals.

References
Haartz, J.C. and Sweeney, M.H. 1995, Work-related musculoskeletal disorders: prevention
and intervention research at NIOSH, In Nordman, H. et al, Sixth US-Finnish Joint
Symposium on Occupational Health and Safety, Research Report 3, (Finnish Institute
of Occupational Health, Helsinki), 135–141
Harrington, J.M. et al 1996, Work related upper limb pain syndromes—origins and
management, Unpublished report on research priorities workshop, (Institute of
Occupational Health, Birmingham)
Health and Safety Commission 1997, Annual Report and Accounts 1996/97 (HSE Books,
Sudbury)
McCaig, R.H. 1996, Managing musculoskeletal disorders—an overview from a medical
perspective, Proceedings, Ergonomics and Occupational Health -
Managing Musculoskeletal Disorders, London, 3 December 1996, (The Ergonomics
Society, Loughborough)
National Institute for Occupational Safety and Health 1996, National Occupational
Research Agenda, DHHS(NIOSH) Publication 96–115, (NIOSH, Cincinnati).
Rantanen, J. 1992, Priority setting and evaluation as tools for planning research strategy,
Scandinavian Journal of Work, Environment and Health, 18, Suppl 2, 5–7
Viikari-Juntura, E. 1995, Prevention program on work-related musculoskeletal disorders, In
Nordman, H. et al, Sixth US-Finnish Joint Symposium on Occupational Health and
Safety, Research Report 3, (Finnish Institute of Occupational Health, Helsinki),
151–154

The opinions expressed in this paper are those of the authors and do not necessarily
reflect the views of the Health and Safety Executive.
A MUSCULOSKELETAL RISK SCREENING TOOL FOR
AUTOMOTIVE LINE MANAGERS

A Wilkinson*, RJ Graves*, S Chambers**, R Leaver**

* Department of Environmental & Occupational Medicine
University Medical School, University of Aberdeen
Foresterhill, Aberdeen, AB25 2ZD
** Occupational Health Department
Land Rover, Rover Group
Solihull

The routine assessment of musculoskeletal disorders (MSD) risk at line management level in industry is an important step in risk management. Tools
such as Rapid Upper Limb Assessment etc., provide various means of trying
to integrate MSD risk assessment but appear to use differing criteria and
emphasise risk to different parts of the body. A study was undertaken to
develop a Statutory Musculoskeletal Assessment Risk Tool (SMART) and to
assess the effectiveness of the company’s current risk assessment tool. Two
groups of twenty line level managers acted as subjects, one group using the
old tool and the other the new tool. The results showed that there was an
improvement in the accuracy and sensitivity of risk identification using the
new tool.

Introduction
The routine assessment of musculoskeletal disorders (MSD) risk at line management level in
industry is an important step in risk management. Tools such as RULA (McAtamney and
Corlett, 1993) and OWAS (Kant et al, 1990), supplemented by the Health and Safety
Executive’s Manual Handling Operations Regulations (HSE, 1992) provide various means of
trying to integrate MSD risk assessment. The former tools appear to use differing criteria and
tend to emphasise risk to different parts of the body. As a first stage in any statutory risk
assessment, there is a need for a tool that helps users to identify risks as defined by the
MHOR and those risks which can lead to Upper Limb Disorders, indicated by criteria such as those used by the HSE (1990, 1994).
As part of an initiative to reduce the incidence of work related MSD on site through
increased awareness, an automotive company intended to make its line managers responsible
for screening for sources of risk within their work areas. An existing tool (Associate joB
Analysis, ABA, BMG AG/Rover Group, 1996) was available but there was some concern that
it did not highlight sources of MSD risk accurately enough. The company wanted screening tools that were simple and straightforward to use, enabling accurate, quick and reliable assessments to be made of the risk from individual jobs without the need for detailed training in ergonomics. In addition, the tool(s) needed to provide enough information so that
task requirements could be assessed to help in initial job placement, rotation and/or
rehabilitation, not necessarily by managers but by the occupational health staff.
A study was undertaken to develop a statutory based manual handling and
musculoskeletal risk assessment tool. This effectively provided a first stage indication of
potential risk using the MHOR and so could be termed a Statutory Musculoskeletal
Assessment Risk Tool (SMART). In addition the study needed to assess the effectiveness of
the company’s current risk assessment tool (Wilkinson, 1998).

Approach

Overview
First, a prototype SMART was developed by examining the literature to identify appropriate criteria. It consisted of two sections. The first was a modification of an MHOR worksheet, used to determine whether any of the tasks exceeded MHOR guidance. The second section was used to determine whether there were other MSD risks. Each section was intended to provide risk scores indicating high risk (red), medium risk (amber) or low risk (green). The design was intended to be as visual and simple as possible so that line managers could use it routinely.
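As a rough illustration of how two section scores might be mapped onto red, amber and green bands and combined into an overall rating, the Python sketch below shows one possible scheme; the threshold values and the worst-case combination rule are assumptions for the example and are not taken from SMART, whose detailed scoring criteria are not given here.

# Hypothetical sketch of a traffic-light banding scheme for a two-section
# screening tool such as SMART. The score thresholds are illustrative only;
# the actual SMART scoring criteria are not specified in this paper.

def band(score, amber_from=4, red_from=8):
    """Map a numeric section score onto a traffic-light band."""
    if score >= red_from:
        return "red"       # high risk: a more detailed assessment is implied
    if score >= amber_from:
        return "amber"     # medium risk
    return "green"         # low risk

def assess_task(manual_handling_score, msd_score):
    """Band each section and take the worst band as the overall rating."""
    bands = {
        "manual_handling": band(manual_handling_score),
        "general_msd": band(msd_score),
    }
    order = ["green", "amber", "red"]
    bands["overall"] = max(bands.values(), key=order.index)
    return bands

print(assess_task(manual_handling_score=9, msd_score=3))
# {'manual_handling': 'red', 'general_msd': 'green', 'overall': 'red'}
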
For the experimental evaluation of the two assessment tools, video recordings of tasks
from a vehicle assembly line were compiled to represent a range of typical work activities
found on site. The tasks were selected to cover a wide range of postural work activity and
load handling.
Once they had been trained, two groups of twenty line level managers acted as subjects, one group using the ABA tool and the other the new tool. The time taken and accuracy of the
assessments were recorded. A team consisting of an ergonomist, an occupational physician,
two physiotherapists, two occupational health nurses and a health and safety officer,
determined the ‘gold standard’ levels of risk present in each of the tasks. The latter were used
as a base line for analysing the results from the experimental study.

Stage 1
This involved developing a prototype SMART. The first part of this involved developing a
summary sheet using job process sheets (internal job descriptors) as a basis for breaking
down the job into elements. The next two sections concerned manual handling and general
musculoskeletal risk assessments.
The manual handling section reflected the needs of the MHOR and used diagrams to take the user through the assessment process. Because guideline loads may need to be reduced to take account of factors such as repetition, correction factors were provided in tabular form. Where a high risk was identified, this implied that a more detailed assessment would be needed.
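To illustrate how tabulated correction factors of this kind can be applied, the sketch below reduces a baseline guideline weight according to repetition rate; the baseline figure and the reduction factors are invented for the example and are not the MHOR guideline figures or the SMART correction table.

# Hypothetical application of tabulated correction factors to a guideline
# load. The baseline weight and the reduction factors are invented; they are
# not the actual MHOR guideline figures or the SMART correction table.

CORRECTION_FACTORS = [             # (maximum lifts per minute, multiplier)
    (1, 1.0),
    (2, 0.7),
    (8, 0.5),
    (float("inf"), 0.2),
]

def corrected_guideline(baseline_kg, lifts_per_minute):
    """Reduce a baseline guideline weight according to repetition rate."""
    for max_rate, factor in CORRECTION_FACTORS:
        if lifts_per_minute <= max_rate:
            return baseline_kg * factor

print(corrected_guideline(baseline_kg=20.0, lifts_per_minute=5))   # 10.0
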
The section on general musculoskeletal risk took account of different body parts including
the neck, shoulders, back, wrists and hands. Generally postures were assessed against criteria
such as repetition, force and duration to provide an integrated score.
Examples of line tasks were selected for the study and included one task for training
purposes, and ten for the experimental assessments. Tasks were chosen to cover aspects of
high, moderate and low risk levels in relation to MHOR, RULA and OWAS.

Actual risk levels present in each task, and the key causal factors of that risk were established
to the satisfaction of a professional panel of judges including an Occupational Physician, a
physiotherapist and an ergonomist. This provided the gold standard referred to earlier.
For the pilot phase of the study subjects were selected from occupational health staff
members to ensure some level of ergonomic awareness based on previous health and safety
training. The subjects worked through an assessment of the training task using the SMART,
and then were asked to use it to make assessments of the ten experimental tasks.
The results of the pilot study were compared with those of the professional team. The
prototype SMART was reworked and amended on the basis of these results and comments
from the subjects.

Stage 2
This involved comparing groups on performance in terms of accuracy, speed and ease of use
using the prototype SMART and ABA assessment tool.
Forty line level managers (as a representative sample of the intended users of the final
form) were selected to take part in the experimental phase, of whom at least twenty were pre-
trained in the use of the current ABA form. A summary sheet was produced for use with the
ABA assessment, so that the output from this form was comparable with the output of
SMART in terms of high, medium and low risk.
Twenty subjects pre-trained in ABA made up one experimental group (Non expert ABA).
They worked through the assessment of the training task using the ABA and the ABA
summary sheet, then completed the assessment of the ten experimental tasks using the ABA
form unassisted.
The remaining twenty subjects were used for the assessment of the prototype SMART
(Non expert SMART). These were trained in using the SMART by working through the form
and assessing the training task. Following this, they made the assessments of the ten
experimental tasks unassisted.
Analyses of accuracy were undertaken in relation to the risk levels for each factor by
comparing performance of the professional panel in using the ABA and SMART against the
non expert ABA and the non expert SMART groups. Similarly, comparisons between the
groups were undertaken to assess speed in relation to the time taken to make the assessments
of the ten experimental tasks, and usability in relation to a Likert scale evaluation of the
form’s clarity etc.
Accuracy was taken as the percentage of the risk assessments agreeing with the control (gold standard). The assessment of accuracy depended upon comparing performance between groups and recording false positives and false negatives. A false positive was defined as finding a degree of risk where there was none, i.e. the performance was oversensitive. False negatives occurred where there was risk but it was not found, i.e. performance was not sensitive enough.
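A minimal sketch of how agreement with a gold standard can be scored in these terms is given below; the band labels and the example data are invented and are not results from the study.

# Minimal sketch of scoring assessments against a gold standard in terms of
# accuracy, false positives and false negatives. The example data are
# invented; they are not results from the study.

def score_against_gold(predicted, gold, no_risk="green"):
    """predicted, gold: equal-length lists of risk bands for the same tasks."""
    n = len(gold)
    correct = sum(p == g for p, g in zip(predicted, gold))
    false_pos = sum(p != no_risk and g == no_risk for p, g in zip(predicted, gold))
    false_neg = sum(p == no_risk and g != no_risk for p, g in zip(predicted, gold))
    return {
        "accuracy_%": 100.0 * correct / n,          # agreement with the gold standard
        "false_positive_%": 100.0 * false_pos / n,  # oversensitive: risk found where there was none
        "false_negative_%": 100.0 * false_neg / n,  # not sensitive enough: risk missed
    }

gold = ["red", "green", "amber", "green", "red"]
predicted = ["red", "amber", "green", "green", "red"]
print(score_against_gold(predicted, gold))
# {'accuracy_%': 60.0, 'false_positive_%': 20.0, 'false_negative_%': 20.0}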

Results and Discussion



Figure 1 Comparison of overall accuracy assessment for non experts versus experts for both tools

Figure 2 Comparison of accuracy of assessment for non experts versus experts for both tools for neck assessments (group means)

Figure 1 shows a comparison of overall accuracy assessment for Non-Experts versus Experts for both tools. Overall, there were lower percentages of False Negative scores when using the SMART compared to the ABA tool. This appears to indicate that performance was more sensitive in detecting risk with the SMART.
Examining the accuracy of the Non-expert ABA versus the Expert ABA groups in more detail, it can be seen that both had similar performance, although the Expert ABA group had marginally higher Accuracy scores. It can be concluded that performance was reasonably similar using the ABA form. Examining the accuracy of the Non-expert versus the Expert SMART groups, it can be seen that both had similar performance in terms of Accuracy, although the Expert group had a slightly lower percentage of Accuracy scores.
Analyses were carried out for each of the sections of the SMART. The first involved
examining the manual handling assessments (Section one). This showed that both the Non-
expert and Expert groups had better sensitivity in detecting risk with the SMART.
The next section (MSD) covered the neck, shoulders, arms, wrists, hands, back and legs. Figure 2 shows a comparison of accuracy of assessment for non experts versus experts for both tools for neck assessments (group means). Both groups using SMART were more accurate and had better sensitivity in detecting risk. The results for the other parts of the body tended to be mixed. For example, risk to the shoulders and arms tended to be detected more accurately with the ABA tool, but the SMART was more sensitive in detecting risk.
The results of the study indicate that SMART has the potential for overall accuracy and
sensitivity in detecting risk for line managers. As such it seems to be a sensitive means of
highlighting potential risk. There is, however, a trade-off between how much sensitivity is
needed in practice and the amount of risk in a specific task, and this needs to be examined in
more detail. In addition, more work should be undertaken to improve this tool in relation to
risk to certain parts of the body.

References
BMG AG/Rover Group 1996, ABA Associate joB Analysis, ABA Rev/6 DOC September,
1996, BMG AG/Rover Group
Health and Safety Executive 1990, Work related upper limb disorders. A guide to
prevention. HMSO, London
Health and Safety Executive 1992, Manual handling. Guidance on regulations Manual
Handling Operations Regulations 1992. HMSO, London
Health and Safety Executive 1994, Upper limb disorders: Assessing the risks. Health and
Safety Executive
Kant, I., Notermans, J.H.V. and Borm, P.J.A. 1990, Observations of the postures in garages using
the Ovako Working Posture Analysing System (OWAS) and consequent workload
reduction recommendations, Ergonomics, 33, 2, 209–220
McAtamney, L., Corlett, E.N. 1993, RULA: a survey method for the investigation of work
related upper limb disorders, Applied Ergonomics, 24, 91–99
Wilkinson, A. 1998, Development of a manual handling and musculoskeletal risk
assessment screening tool for line managers at an automotive plant, MSc Ergonomics
Project Thesis, Department of Environmental and Occupational Medicine, University
of Aberdeen: Aberdeen.
RISK ASSESSMENT DESIGN FOR MUSCULOSKELETAL
DISORDERS IN HEALTHCARE PROFESSIONALS

Caryl Beynon, Diana Leighton, Alan Nevill, Thomas Reilly

School of Human Sciences
Liverpool John Moores University
Mountford Building
Liverpool, L3 3AF

An ergonomic check-list was developed to assess the risk of performing certain nursing and physiotherapy tasks. This was in response to extensive
epidemiological work identifying the magnitude of the problem.
Questionnaires were used to assess the musculoskeletal symptoms
experienced by nurses and physiotherapists and life-time prevalence was
49%. Low back/buttocks/upper legs was identified as the anatomical area
most affected. The risk assessment pro-forma was based on guidelines
provided by the Health and Safety Executive but was amended with
reference to the questionnaire results. A scoring system was devised so an
overall risk score for performing a specific task can be identified. The study
indicates the benefits of using epidemiology results when devising an
ergonomic risk assessment procedure.

Introduction
It was evident from a review of the literature that nursing is frequently cited as an occupation with a high risk of back problems (Hildebrandt, 1995). This constitutes a huge financial burden and potentially long periods of sickness absence from work. Whilst a plethora of studies concerning back pain within the nursing profession exists, this area of research is rarely expanded to include other anatomical sites, and other healthcare professionals are rarely cited in the literature. Physiotherapists are often neglected in research, possibly because it is
assumed that they have superior understanding of body mechanics and in particular back
protection (Molumphy et al., 1985). In order to quantify the prevalence of various
musculoskeletal disorders and to enable comparisons to be made between the nursing and
physiotherapy professions, comprehensive epidemiological investigations must be
undertaken.
While epidemiology is important in giving a preliminary overview of the problem, it is
then necessary to establish the possible causes of this occupational strain using more
objective measures. Only then can guidelines to reduce the risk factors be implemented. The
aim of this study was to utilise the results of an epidemiological investigation to develop a risk assessment procedure through which this could be achieved.

Epidemiology of Musculoskeletal Disorders

Methodology

A confidential questionnaire designed for self-administration was utilised within a cross-sectional investigation. Questionnaires were distributed to 4220 nurses within 7 hospitals and 794 physiotherapists in 20 hospitals. Head manager nurses/superintendent physiotherapists distributed the forms either directly to the sample group or via the heads of wards, depending on the numbers involved. The questionnaires were distributed randomly to gain a cross-section of ages, grades, specialties and gender. The results were analysed using the SPSS statistical package; chi-squared and logistic regression tests were used in the analysis of the data.
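By way of illustration, a chi-squared test of association between profession and reported symptoms could be run as below (using Python's scipy rather than SPSS); the 2x2 counts are invented for the example and are not the survey data.

# Illustrative chi-squared test of association between profession and
# reported musculoskeletal symptoms. The counts below are invented for the
# example; the survey itself was analysed in SPSS.
from scipy.stats import chi2_contingency

#                 symptoms  no symptoms
table = [[400, 413],        # nurses (hypothetical counts)
         [170, 179]]        # physiotherapists (hypothetical counts)

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi-squared = {chi2:.2f}, df = {dof}, p = {p_value:.3f}")
# A p value above 0.05 would indicate no significant association.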

Results

A response rate of 44% (n=349) was obtained for the survey of physiotherapists; the
questionnaire was completed by 19% (n=813) of the nursing personnel sampled. The sample
characteristics of both populations are shown in Table 1.

Table 1. Sample characteristics of questionnaire respondents (means and standard deviation are reported)

The lifetime prevalence of musculoskeletal disorders at various locations was 49%, and the point prevalence was 20.7%. Almost half (42.2%) of those who had suffered symptoms at some time in their working lives were therefore exhibiting symptoms at the time of the questionnaire. Point prevalence gives an indication of the immediate impact, but many sufferers reported recurring symptoms which may not have been present at the time of questioning and which would therefore have gone undetected in the study if lifetime prevalence had not also been recorded.
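The 42.2% figure follows directly from the two prevalence estimates, as the short check below shows.

# The proportion of lifetime sufferers who were symptomatic at the time of
# the questionnaire is the point prevalence divided by the lifetime prevalence.
point_prevalence = 20.7       # per cent
lifetime_prevalence = 49.0    # per cent
print(round(100 * point_prevalence / lifetime_prevalence, 1))   # 42.2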

Respondents indicated the site of musculoskeletal symptoms on an anatomical diagram; these sites were grouped into specific areas for analysis. The anatomical areas and the
corresponding percentages of individuals who had experienced symptoms in these areas are shown in Table 2. There was no significant difference in the relative percentages of nurses and physiotherapists who had suffered a musculoskeletal disorder during their working life (p>0.05). However, the location of disorders was significantly different (p<0.05) between the
(p<0.05). However, the location of disorders was significantly different (p<0.05) between the
two samples. Physiotherapists suffered more symptoms relating to the wrist, fingers, hand
and forearm, knee and lower limb than the nursing sample.

Table 2. Percentage of individuals that had experienced symptoms in each defined anatomical area at some period in their working life

Perceived causes
Regarding lifetime prevalence, 36.4% of respondents with musculoskeletal symptoms could
recall a specific causal incident. Patient handling and lifting was indicated as the cause by
66.7%. Similarly, of those personnel who attributed their symptoms to continued exposure to
a stressor, patient handling and lifting was implicated by 51.3% of respondents.
Logistic regression analysis was employed to identify factors with predictive value for musculoskeletal disorders in general and for back pain specifically. Neither the performance of manual lifts nor the number of lifts performed by the nurses and physiotherapists was a significant predictor of the prevalence of musculoskeletal symptoms. Other factors were shown to have predictive value, but these are beyond the scope of this paper.

Development of the ergonomic risk assessment pro-forma


The results of the epidemiological survey indicated a need to develop an objective risk
assessment procedure. This assessment pro-forma was developed based upon the guidelines
provided by the Health and Safety Executive, and incorporates information relating to
occupational, environmental, organisational and personal factors. Pilot work was performed
on a range of personnel by attending a variety of hospital wards and physiotherapy
departments at Southport and Formby District General Hospital to ensure all typical actions
could be recorded. The results of the epidemiological study were also used in the risk
assessment development. For example, the questionnaire indicated the relatively high
proportion of physiotherapists with problems in the wrists and fingers, so a section indicating
finger and wrist force was included in the risk assessment pro-forma.
Sub-sections of the check-list detailing the task, posture, load, environmental conditions, the psychological state of the individual and the forces acting on the wrists and fingers were included. An example of one sub-section is given in Table 3.

Table 3. Example of one sub-section used in the risk assessment pro-forma

A scoring system was devised for each sub-section, and the sub-section scores were totalled to indicate the overall measure of risk for the specific activity. Certain tasks/postures are assigned a score depending on risk; for example, trunk flexion of 45° scores 2, compared with flexion of 90°, which scores 4. A short description of the task was included at the time of recording so that a composite score was associated with specific activities. Observing numerous individuals and collecting a large sample of data will reduce the effect of large individual differences by providing mean scores for the performance of a specific task. A total of 45 hours of risk assessment data will be collected, with a risk assessment performed every 10 minutes within each hour period.
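As an illustration of how such a composite score might be built up, the sketch below totals sub-section scores for a single observation; apart from the trunk-flexion scores quoted above, all values and category names are hypothetical.

# Illustrative totalling of sub-section scores from the risk assessment
# pro-forma. Apart from the trunk flexion scores (45 degrees = 2,
# 90 degrees = 4, quoted in the text), all values below are invented.

def trunk_flexion_score(angle_deg):
    """Score trunk flexion; thresholds other than 45/90 degrees are assumed."""
    if angle_deg >= 90:
        return 4
    if angle_deg >= 45:
        return 2
    return 0

def total_risk_score(observation):
    """Sum the scores recorded for each sub-section of the check-list."""
    return sum(observation.values())

observation = {
    "posture": trunk_flexion_score(90),   # 4
    "load": 3,                            # hypothetical score
    "environment": 1,                     # hypothetical score
    "psychological_state": 2,             # hypothetical score
    "wrist_finger_force": 2,              # hypothetical score
}
print(total_risk_score(observation))      # 12
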
Data collection is currently under way and the results will be compared with the questionnaire responses to determine exactly which aspects of the professions are detrimental in terms of the onset of musculoskeletal disorders. As noted above, 66.7% of those who attributed their symptoms to a single event, and 51.3% of those who attributed them to continued exposure to a stressor, implicated patient handling and lifting. Patient handling is frequently cited as the most common cause precipitating a period of low back pain in both nursing (Jensen, 1990) and physiotherapy (Bork et al., 1996), yet the logistic regression failed to find lifting per se to have predictive value for the onset of musculoskeletal disorders. The risk assessment will indicate which specific lifting and handling tasks, or components of tasks, are detrimental, as well as other factors which may have been overlooked because of the preoccupation with manual handling.
The epidemiological analysis identified high and low risk specialties. A range of both high
and low risk wards were chosen for the risk assessment, to be performed at a District General
Hospital on Merseyside. The assessor ‘shadows’ one member of staff for a one-hour period
during the course of their working day and an instantaneous assessment is performed every
10 minutes. By remaining with the member of staff continuously for the whole hour, the
assessor is also able to assess the psychological characteristics of the individual which were
shown to be important in the questionnaire analysis. Assessments take place at different times
of the day, and incorporate personnel of both sexes, and a range of grades to ensure a cross-
section of information is collected.

Conclusion
It is apparent that nurses and physiotherapists are highly susceptible to experiencing
musculoskeletal disorders of a perceived work-related origin. It is also apparent that these
healthcare professionals perceive patient handling tasks to be instrumental in the onset of
symptoms. This objective risk assessment procedure, developed in response to results of
extensive epidemiology, may be applied within the work environment to establish the overall
risk of performing various occupational tasks. Data collection and analysis are under way to explore which specific occupational activities incur the greatest risk to personnel. Most importantly, the observation check-list can be completed manually and is a quick, non-intrusive method of collecting large amounts of data.

References
Bork, B.E., Cook, T.M., Rosecrance, J.C., Engelhardt, K.A., Thomason, M.E.J, Wauford,
I.J. and Worley, R.W. 1996, Work-related musculoskeletal disorders among physical
therapists, Physical Therapy, 76, 827–835
Hildebrandt, V.H. 1995, Back pain in the working population: prevalence rates in Dutch
trades and professions, Ergonomics, 38, 1283–1298
Jensen, R.C. 1990, Back injuries among nursing personnel related to exposure, Applied Occupational and Environmental Hygiene, 5, 38–45.
Molumphy, M., Unger, B., Jensen, G.M. and Lopopolo, R.B. 1985, Incidence of work-
related low back pain in physical therapists, Physical Therapy, 65, 482–486
ERGONOMIC MICROSCOPES—SOLUTIONS FOR THE
CYTO-SCREENER?

J.L.May & A.G.Gale

Applied Vision Research Unit, University of Derby,
Mickleover, Derby DE3 5GX, UK

Previous studies have found that microscope users demonstrate a high number of visual and postural problems, potentially resulting in poor productivity, discomfort and fatigue. These problems have potentially serious consequences, particularly in the medical setting, as they may lead to errors in
medical diagnosis. This paper reviews international literature and details
some of the findings of previous ergonomic studies regarding microscopy.
The main problems found in microscopy both in industrial and medical
settings are detailed. Potential solutions to some of the problems identified
and the practicality of applying these to a cytology laboratory are discussed.

Introduction
Many assembly and inspection tasks in the electronics industry require the use of a
microscope to ensure efficient manufacture and quality control. Microscopes are also used widely in medical and research settings for the close analysis of cultures or samples of cells to aid the diagnosis and treatment of various diseases. The intensive use of microscopes in some situations, however, has led to reported problems. The majority of problems appear to fall into two categories: visual and postural.

Visual problems
One of the first documented accounts of what is termed “operational microscope myopia” is by Druault (1946), who reported that near-sightedness (myopia) and double vision in physicians were the result of undue accommodation and convergence caused by the use of microscopes. In cytology screening, where microscope use is intensive, more recent studies have found that 73% of cytology screeners reported eye strain (Hopper et al, 1997). Similar levels exist in the electronics industry. Frenette and Desnoyers (1986) conducted tests of visual fatigue on cyto-technicians both before and after they started work and compared the results with a group of haematologists, who did not use a microscope as intensively. Some 31% of the cyto-technicians showed symptoms of blurred vision at the end of the day compared with just 3% of the haematologists. Elias and Cail (1984) also reported an increase in the prevalence of visual symptoms for microscopists working more than 20 hours per week compared with those working less. Reasons for visual fatigue may include:

• Long periods of Accommodation. Intense periods of microscope work require the eyes to
perform long periods of accommodation possibly resulting in the condition referred to as
“temporary or operational myopia”, (Zoz et al, 1972).
• Ophthalmic factors. Various ophthalmologic factors such as long-sightedness or
astigmatism increase a person’s susceptibility to visual fatigue, (Zoz et al, 1972).
• Microscope Illumination. If the illumination is too bright a distinct “pulling sensation”
can be felt (Burrells, 1977). Exposure to bright light may increase retinal detachment and
myopia, (Ostberg and Moss, 1984).
• Environmental Conditions. The microscopist may be subject to glare from reflective table
tops, high levels of illumination and sunlight. Insufficient air movement and humidity
may contribute to eye problems.

Postural Problems
Sustained voluntary and involuntary contractions of the ocular and neck muscles when using microscopes can give rise to headaches and stiffness in the neck (Simons, 1942). In industry, 45% of microscope workers suffered from muscular ailments (Soderberg, 1978), while in the medical field Hopper et al (1997) found that 78% of cyto-screeners in the UK reported muscular pain. Pain was reported to be experienced every day by 28.6%, while a further 53.7% experienced muscular discomfort during each week. The areas of the body where discomfort was most commonly reported were the neck, shoulders, upper and lower back and wrists. These users found it harder to concentrate and had lower job satisfaction. Key factors are:

• Maintaining a Fixed Posture. Microscope work involves users adopting a fixed position
for long periods of time. The number of hours spent in a fixed position while using a
microscope was related to the reporting of discomfort symptoms in the neck and back,
(Grieg and Caple, 1987).
• Performing Small Repetitive Movements. Microscopists move slides by continuous
operation of the controls using small precise movements of the hands. The control
location often forces the user to adopt awkward hand/arm positions.
• Inadequate Microscope and Furniture Design. Various types of microscopes are not
adjustable in terms of eye piece height, angle or control position and this may lead to
muscular discomfort, (Soderberg, 1978). Unsuitable benches and chairs were often used
and inadequate space provided, (Hopper et al, 1997).
• Stress. Maintaining long periods of concentration has a fatiguing effect on the user (Johnsson, 1981). The pressure to avoid making a diagnostic error is very high, which again can lead to additional stress and fatigue.

Implications/Recommendations

1) Minimize the need for microscopy


Many microscope jobs in health and industry have been eliminated by automation and alternative viewing systems. Potential problems of these systems, however, include inadequate resolution and poor colour rendition, which render them unsuitable for several
tasks. The introduction of new technology is also likely to pose new ergonomic problems to
be addressed.

2) Design and purchase suitable equipment and furniture


Recommended microscope viewing characteristics (e.g. Ostberg and Moss, 1984) are:-

• Eye pieces with built-in artificial depth cues may help to reduce accommodation.
• The interpupillary distance (IPD) of the eyepieces should be adjustable between 50–76mm, corresponding to a viewing distance of 1.0–2.5 dioptres and a convergence angle of 3–10 degrees (see the conversion sketch after this list).
• A flat field is necessary for a clear image at both the periphery and the centre of the visual field.
• A detector device could warn operators when the visible light is too bright for long-term viewing.
• Use coloured filters to filter out unhealthy parts of the spectrum.
• To accommodate differences in fusion capacity, a phoria adjustment is needed.
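Since a viewing distance expressed in dioptres is the reciprocal of the distance in metres, the quoted 1.0–2.5 dioptre range corresponds to roughly 1.0 m down to 0.4 m, as the short check below shows.

# A viewing distance (accommodation demand) in dioptres is the reciprocal of
# the distance in metres, so 1.0-2.5 D corresponds to about 1.0-0.4 m.
def dioptres_to_metres(dioptres):
    return 1.0 / dioptres

for d in (1.0, 2.5):
    print(f"{d} D -> {dioptres_to_metres(d):.2f} m")
# 1.0 D -> 1.00 m
# 2.5 D -> 0.40 m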

Eye piece height and angle are determining factors in neck angle, and some of the postural problems could be eliminated if the eye pieces could be adjusted to allow the operator to adopt a more horizontal line of sight (Soderberg, 1978). It should be possible to adjust the distance from the front of the microscope to the front of the eye piece to accommodate individuals with a large abdominal depth. To reduce stress on the shoulders and neck, the focusing controls should be brought down close to table-top height. The maximum distance of the controls from the edge of the table should be suitable for the smallest user, to minimise the tendency to bend forward. Some ability to adjust the position of the controls would be beneficial. Recent
improvements in microscope design have been made and some microscope manufacturers are
now starting to offer more ‘ergonomically designed’ microscopes incorporating some of these
features. Purchasing such modern microscopes and additional aids however may be beyond
financial possibility and the user may be left with an inadequate microscope for some time.
Kumar and Scaife (1979), analyzing measurements of muscle activity in microscope workers, found that even small changes in workstation design, such as the degree of incline and the height of the table top, produced significant changes in work posture and muscle activity. It was recommended that the height of the table and chair should be adjustable to reduce stress on the back and shoulders. Thinner benches which still provide adequate stability are needed, and footrests need to be provided if the user is unable to rest their feet on the floor. Sufficient working space for the task must be provided (Soderberg, 1978).

3) Modify Existing Equipment


Modifications may range from being cost free to very expensive, and again the extent to which recommendations are implemented is dependent upon the financial support available. It is possible to buy “add-ons” or inserts to adjust the height and angle of the eye piece. To minimize the stress placed on the wrists by continuous operation of the controls, products are now becoming available which, by flicking a lever or using a mouse, move the stage and adjust the fine focus controls electronically. Cheaper modifications can be achieved by placing stands under the microscope to adjust the height and angle. While this reduces the need to bend over the microscope, it also raises the
height of the controls further above bench height thus increasing the stress placed on the
shoulders. Any change in eye piece angle towards the operator will reduce strain on the neck.
The microscope can be tilted forward by placing blocks under the rear edge of the base; the more it is tilted forward, however, the more likely items are to fall off the stage. Padding has also been provided in some laboratories to cushion sharp corners on benches and to allow the hands to rest on equipment which is cold to the touch. Users have also adapted boxes and buckets as footrests where these have not been adequately provided.

4) Minimize the time spent on task


Users who perform the most microscope work experience more discomfort and visual problems (Rohmert et al, 1986). It is therefore important to minimize the time each person spends using a microscope by alternating microscope work with other tasks. Where ergonomic approaches involving the redesign of work tasks or equipment are not feasible, rest breaks and shortened working hours must be considered. This is difficult for cyto-screeners within the UK, as there are very few other tasks with which they can rotate.

5) Provide Appropriate Training


New microscopists should be given training in how to set up their microscope. All
microscopists should be aware of how to use and adjust their workstation to minimise
discomfort. Both eye exercises and a physical exercise programme may reduce fatigue (Haines and McAtamney, 1993; MacLeod and Bannon, 1973).

6) Provide Vision Screening for Microscopists


Individuals with astigmatism, myopia, and hyperopia may be unsuitable for intensive
microscope work (Olsson, 1985). It may therefore be beneficial to screen users initially for
existing visual problems and at regular intervals thereafter.

Conclusion
There is evidence of visual and musculoskeletal problems amongst microscope workers. Due
to the fixed posture required in microscopy it is important that equipment and furniture are
provided which are suitable for the user. Although ergonomic problems were originally highlighted some time ago, many of the difficulties regarding the usability of microscopes for intensive purposes still remain. Some manufacturers have recently addressed some of the issues concerning microscope design by providing adjustable features on particular models. In some settings, however, it has not been possible to afford these new microscopes and users are left to work with cheaper models which offer very little in terms of flexibility and adaptability. Some companies also produce ‘ergonomic’ attachments which do improve the adjustability of certain microscope features, but they are often only suitable for use with a certain kind of microscope. It is also unclear if the range of adjustment that they give will be
suitable to all potential users in their own working environment. It is important therefore that
a wider ergonomic approach is considered when looking at the problems of microscope
usage. This should incorporate not only the design and layout of equipment and furniture
used but also the user’s job design, satisfaction and organizational pressures. Users should
also be screened to ensure their suitability for the job performed, and trained to set up their
workstation appropriately for their own needs and requirements.

Acknowledgment
We would like to thank the NHSCSP who kindly funded this work.

References
Burrells W., 1977, Microscope Technique. A Comprehensive Handbook for General and
Applied Microscopy. (New York, NY: John Wiley and Sons).
Druault A., 1946, Visual Problems Following Microscope Use. Annals Oculistique,
138–142,
Elias R., & Cail, F., 1984, Work with Binocular Microscopes—Visual and Postural Strain,
INRS Cahiers de notes documentaires, 117, 451–456.
Frenette B., & Desnoyers L., 1986, A study of the Effects of Microscope Work on the Visual
System. Proceedings of the 19th Annual Meeting of the Human Factors Association of
Canada, Richmond (Vancouver), August 22–23, 1986
Grieg J., & Caple D., 1987, Optical Microscopes in the Research Laboratory, Proceedings
of the 24th Annual Conference of the Ergonomics Society of Australia. Melbourne.
Haines, H., & McAtamney, L., 1993, Applying Ergonomics to Improve Microscopy Work.
Microscopy and Analysis, July 15–17.
Hopper, J.A., May, J.L., & Gale, A.G., 1997, Screening for Cervical Cancer: The Role of
Ergonomics. In S.A.Robertson (ed.) Contemporary Ergonomics 1997, 38–43.
Johnsson C.R., 1981, Cytodiagnostic microscope work. In O.Ostberg and C.E.Moss (1984),
Microscope work—Ergonomic Problems and Remedies, Proceedings of the 1984
International Conference on Occupational Ergonomics, Rexdale, Ontario, Canada.
Human Factors Association of Canada, 402–406.
Kumar S.W., & Scaife W.G.S., 1979, A Precision Task, Posture, and Strain, Journal of
Safety Research, 11, 28–36.
Olsson A., 1985, Ergonomi I mikroskaparbete, Rifa AB, Stockholm.
Ostberg O., & Moss C.E., 1984, Microscope work—Ergonomic Problems and Remedies,
Proceedings of the 1984 International Conference on Occupational Ergonomics,
Rexdale, Ontario, Canada. Human Factors Association of Canada, 402–406.
MacLeod D., & Bannon, R.E., 1973, Microscopes and Eye Fatigue. Industrial Medicine and
Surgery, 42, (2) 7–9.
Rohmert W., Haider E., Hecker C., Mainzer, J., & Zipp, P., 1986, Using a Microscope for
Visual Inspection and Repair of Printed Circuits, Ceramic Films and Microchips.
Zeitschrift für Arbeitswissenschaft.
Simons D., Day, E., Goodell, H., & Wolf, H., 1942, Experimental Studies on Headache:
Muscles of the Scalp and Neck as Sources of Pain, Research Publications
Association, 23:228–244.
Soderberg, I., 1978, Microscope work II. An Ergonomic Study of Microscope Work at an
Electronic Plant. Report No. 40, National Board of Occupational Safety and Health,
Sweden.
Zoz, N.I., Kuznetsov, J., Lavrova, M., & Taubkina, V., 1972, Visual Hygiene in the Use of
Microscopes, Gigiena Truda i Professional’nye Zabolevanija, 16 (2) 5–9.
MUSCULOSKELETAL DISCOMFORT FROM DANCING IN
NIGHTCLUBS

S L Durham and R A Haslam

Health & Safety Ergonomics Unit


Department of Human Sciences
Loughborough University
Loughborough, Leicestershire, LE11 3TU

This paper provides preliminary evidence of the extent of musculoskeletal


discomfort arising from dancing in nightclubs. Subjective data were
collected by means of a postal questionnaire (n=50), and structured
interviews (n=50). An experiment was undertaken, measuring ankle
deceleration during dancing, allowing different floor/shoe combinations to
be compared. A high proportion of survey respondents reported having
musculoskeletal discomfort at some time (86% postal survey; 84% interview
respondents); a notable proportion of those questioned in the nightclub
(52%) reported discomfort at the time of the interview. For the experiment,
10 subjects danced under 4 conditions. Ankle decelerations as high as 18.3g
were measured on the hard-floor/hard-shoe condition. Further research is
recommended to confirm the nature and extent of the problem.

Introduction
Musculoskeletal disorders affect a significant proportion of the population at some time in
their lives. While considerable research has been and is being undertaken concerning the problem
in the workplace and with respect to competitive sport, less attention has been paid to
musculoskeletal injury in the general population in connection with leisure activities.
Attendance at nightclubs and ‘raves’ (all night dance parties) is booming, with estimates
that several hundred thousand individuals participate every week (Jones, 1994). ‘Ravers’
dance for long periods on hard, unyielding floor surfaces, such as concrete, while
wearing footwear with poor shock absorption. This investigation was prompted by concern among
the ‘rave’ community regarding aches and pains experienced during and after dancing.
Additional factors that might be involved are use of drugs and elevated body temperatures,
both of which might mask the onset of discomfort, exacerbating the problem. McNeill and
Parsons (1996) found 76% of respondents to a questionnaire survey reported use of drugs
such as 3,4-methylenedioxymethamphetamine (ecstasy). In a thermal chamber experiment,
recreating dancing in night club conditions for an hour, McNeill and Parsons measured deep
body temperature increases to 38.2 °C and rises in mean skin temperature close to deep body
temperature.
The research reported by this paper sought to provide preliminary evidence of the extent
of musculoskeletal discomfort arising from dancing in nightclubs. The research involved
three studies: a postal survey, structured interviews in nightclubs, and an experiment where
ankle decelerations during dancing were measured, enabling different floor/shoe
combinations to be compared.

Postal Survey

Method
A postal questionnaire survey was undertaken to obtain initial data, with particular reference
to longer term effects. The questionnaire was based on the Nordic Musculoskeletal
Questionnaire (Kuorinka et al, 1987), and was distributed on a convenience basis to 100
regular nightclub attendees. Participants were recruited through relevant Internet
newsgroups, nightclub/promoter mailing lists and personal contacts.

Results
The survey achieved a response rate of 50%, with 50 completed questionnaires (25 male, 25
female). Results are summarised in table 1.

Table 1. Summary of results from postal survey

A small but significant negative association was found between time spent dancing and
symptoms (Pearson r=-0.36, p<0.05). This may be because respondents within this sample
experiencing discomfort restrict their dancing. No other significant relationships were found.
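As an illustration of this correlation step, a minimal sketch in Python is given below; the two variables are hypothetical stand-ins for the questionnaire measures of time spent dancing and reported symptoms, not the study's data.

from scipy.stats import pearsonr

hours_dancing = [2, 4, 6, 3, 5, 8, 1, 7, 4, 6]   # hours danced per week (hypothetical)
symptom_score = [5, 4, 2, 5, 3, 1, 6, 2, 4, 3]   # reported discomfort rating (hypothetical)

r, p = pearsonr(hours_dancing, symptom_score)    # returns the correlation coefficient and p-value
print(f"Pearson r = {r:.2f}, p = {p:.3f}")       # a negative r would mirror the trend reported above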

Nightclub Interviews
Additional data were collected by means of an interview survey of dancers in a London
nightclub. The nightclub was selected on the basis of its concrete dance floor, a hard surface
thought likely to maximise problems. The music played on the night of the survey was of
styles ‘hard techno’, ‘hard trance’ and ‘gabba’, music with a high number of beats per
minute, likely to encourage hard, fast dancing.

Method
Interviews were undertaken by 5 interviewers, who approached individuals resting or
walking around who had previously been observed dancing. Interviews took place in the early
morning, between 0300–0600 hours, to allow time for dancing to have taken place. The
interview schedule was adapted from the postal questionnaire, having similar content.

Results
The survey collected data from 50 participants (34 male, 16 female). Results are summarised
in table 2.

Table 2. Summary of results from nightclub interviews

A significant positive association was found between time spent dancing and symptoms
(Pearson r=0.67, p<0.05). No other significant relationships were found within the data.

Laboratory Experiments
The purpose of the laboratory experiments was to allow the effects of different floor/shoe
combinations to be examined.

Method
Subjects were 10 university students, 5 male, 5 female, mean age 21.5 (±2.1), all regular
nightclub attendees. Decelerations were measured using 2 accelerometers, positioned at the
left ankle at the base of the fibula and at the mid-lumbar region of the lower back. Subjects
were asked to dance to an extract of music in their usual manner, under 4 conditions.
Deceleration data were logged for 1 minute within each condition. The 4 conditions were: (1)
hard footwear, hard floor; (2) hard footwear, soft floor; (3) soft footwear, hard floor; (4) soft
footwear, soft floor. The order of conditions was balanced across subjects.
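The paper does not state which balancing scheme was used; one standard option is a balanced Latin square (Williams design), sketched below in Python for the four floor/shoe conditions.

def balanced_latin_square(n):
    # For even n, each condition appears once in every position and
    # precedes/follows every other condition equally often (Williams design).
    rows = []
    for subject in range(n):
        row = []
        for position in range(n):
            if position % 2 == 0:
                row.append((subject + position // 2) % n)
            else:
                row.append((subject - (position + 1) // 2) % n)
        rows.append(row)
    return rows

conditions = ["hard shoe/hard floor", "hard shoe/soft floor",
              "soft shoe/hard floor", "soft shoe/soft floor"]
# With 10 subjects the four orderings would simply be cycled through.
for i, order in enumerate(balanced_latin_square(4), start=1):
    print(f"Subject {i}:", [conditions[c] for c in order])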

Results
Deceleration data were analysed by identifying the 10 peak decelerations for each subject and
calculating the mean for the ankle and lower back, figures 1 and 2.
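A minimal sketch of this peak-extraction step is given below (Python); the accelerometer trace is simulated, purely for illustration.

import numpy as np

def mean_of_top_peaks(signal, n_peaks=10):
    # Mean of the n largest local maxima in a 1-D deceleration trace.
    signal = np.asarray(signal, dtype=float)
    middle = signal[1:-1]
    peaks = middle[(middle > signal[:-2]) & (middle > signal[2:])]  # local maxima
    return np.sort(peaks)[-n_peaks:].mean()

rng = np.random.default_rng(0)
ankle_trace = np.abs(rng.normal(3.0, 4.0, 6000))   # simulated 1-minute trace, in g
print(f"Mean of 10 peak decelerations: {mean_of_top_peaks(ankle_trace):.1f} g")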

Figure 1. Deceleration at Ankle

A mean peak deceleration as great as 18.3g was measured for one subject on his ankle.
The highest mean peak deceleration measured at the lower back position was 10.5g. A
significant interaction was found between deceleration and floor surface (p<0.05), with
the hard flooring having the highest decelerations. No other significant relationships were
found.

Figure 2. Deceleration at Lower Back

Discussion
The results indicate that the ‘ravers’ participating in this study engage in energetic, high impact
dancing, often while under the influence of alcohol or other drugs. The high proportion of
survey respondents reporting musculoskeletal discomfort is notable, with levels exceeding
80% in both surveys. The main sites of discomfort were the lower back, knees and ankles/
feet. The laboratory experiments found high decelerations and demonstrated an effect of floor
surface on ankle deceleration.
Caution is needed in drawing conclusions from this study. The self-selected nature of the
postal survey sample could have resulted in disproportionate representation of those with
musculoskeletal problems. It is considered that the results of the interview survey are less likely
to have been affected by selection bias, but the possibility of sampling effects remains. It is
also clear that use of intoxicants could have affected the responses of both groups of survey
participants.
It is concluded that there is sufficient evidence of a problem to warrant further research.

References
Jones D, 1994, Rave New World. Programme transcript: Equinox, November 1994 (Channel
4: London)
Kuorinka I, Jonsson B, Kilbom Å, Vinterberg H, Biering-Sørensen F, Andersson G,
Jørgensen K, 1987, Standardised Nordic questionnaires for the analysis of
musculoskeletal symptoms. Applied Ergonomics, 18, 233–237
McNeill M and Parsons KC, 1996, Heat stress in night-clubs. In: Contemporary
Ergonomics 1996, edited by Robertson SA (Taylor & Francis: London), 208–213
MANUAL HANDLING
IS THE ERGONOMIC APPROACH ADVOCATED IN THE
MANUAL HANDLING REGULATIONS BEING ADOPTED?

Kevin Tesh

Senior Ergonomist
Institute of Occupational Medicine,
8 Roxburgh Place, Edinburgh EH8 9SU

This paper describes the results of employers’ responses to the Manual


Handling Operations Regulations in terms of reducing risks using the
ergonomic approach. Risk reduction factors considered how the work was
done (the task); what was handled (the load); where the load was handled
(the working environment) and who handled the load (the individual). The
range of measures reported showed that the new ergonomic approach
advocated was being taken on board generally by organisations, but the
practical implementation of some of the risk reduction measures was not
always effective, as indicated by a small number of follow-up site visits.

Introduction
Manual handling has long been recognised as a major cause of occupational injury and ill-
health. In 1994/95 over 115 million days of certified sickness absence were attributed to back
problems and more than 50,000 work-related handling injuries were reported to the enforcing
authorities.
In 1982 the Health and Safety Commission (HSC) circulated a consultative document on
new Regulations and Guidance relating to manual handling at work (HSC, 1982). This represented
a major departure from previous legislation on this topic in that it sought to emphasise that
the risk of injury from manual handling was not simply a function of the weight being handled,
by describing an ergonomic approach to identifying the sources of risk of injury in manual
handling activities. This approach can now be seen in the Council of the European Communities
(CEC) 1990 European Directive on the minimum health and safety requirements for the manual
handling of loads (CEC, 1990). This directive led to the publication of the Manual Handling
Operations Regulations (MHORs) (HSE, 1992) and associated guidance which came into force
in the UK on 1 January 1993. British industry, at least those businesses that are aware of the
Regulations, has been trying to comply with the duties since that date.
In 1996 the HSC decided to evaluate the effectiveness of these Regulations and Guidance
against a background of a wider review of a whole range of health and safety legislation
recently imposed on industry. The Institute of Occupational Medicine (IOM) in Edinburgh
was commissioned by the HSE to conduct a large-scale industry wide survey as part of that
process (Tesh et al., 1997). The IOM had tested the usability of the Guidance by non-
ergonomists in the workplace before the Regulations were implemented, resulting in a more
effective document (Tesh et al., 1992).
The main aims of the project were to study both employers’ and employees’ awareness of,
interpretation of and response to the Regulations; to evaluate their appropriateness and to see
how organisations went about implementing the legislation. This paper describes and
discusses results of employers’ responses to the Regulations in terms of reducing manual
handling risks using the ergonomic approach.

Methods
An employer survey questionnaire addressed all issues relating to the Regulations and Guidance.
The questionnaire asked about knowledge and relevance of the Regulations, their implementation,
costs and benefits as well as the usefulness of the Guidance provided. Particular attention was
paid to making the questionnaire easy to complete to encourage a good response rate.
A questionnaire was posted to a stratified sample of 5,000 employers covering the
following ten industrial sectors: Manufacturing; Construction; Wholesale and Retail;
Agriculture/Horticulture; Transportation/Communications; Finance; Local Government;
NHS and Ambulance Trusts; Fire Brigades; and Services. Organisations were selected from
the single person self-employed up to large companies with 100 or more employees
throughout Britain. Responses to this questionnaire were weighted to allow for distribution of
the sample by size and sector.
The study was particularly interested in the steps taken by employers to reduce manual
handling risks, and so the main factors that employers must consider when making an
assessment of manual handling operations were included in the questionnaire under the
headings of: the tasks; the loads; the working environment; the individual and training. Not all
factors listed in Schedule 1 of the Regulations could be considered, as this would have
significantly increased the length of the questionnaire, thereby discouraging respondents from
completing and returning it.
In order to validate appropriate sections of the employer questionnaire responses, nineteen
follow-up company site visits were conducted. The results of these visits are not fully
discussed in this paper, although some explanatory information gathered from these visits is
used to supplement the employer responses on the risk reduction strategies. Information on
the perception and awareness of the Regulations and Guidance amongst employees was also
gathered using two approaches, viz. through trade union representatives and through
employers during the follow-up company visits. The results of this employee survey are not
addressed in this paper.

Results
Employers were asked to tick the main examples of reducing manual handling risks if they
had both heard of the MHORs and had at least partly implemented the Regulations within
their organisation. The responses from the 5,000 employers were weighted by the distribution
of organisations in Great Britain, in order to give a representative picture of Britain as a
whole. The results quoted are the weighted figures.
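The exact weighting scheme is not reproduced in this paper; the sketch below (Python) illustrates one common approach, post-stratification by size and sector, using entirely hypothetical population and respondent counts.

# Hypothetical organisation counts in Great Britain and completed questionnaires per stratum.
population = {("Manufacturing", "large"): 4000, ("Manufacturing", "small"): 36000,
              ("NHS Trusts", "large"): 400}
respondents = {("Manufacturing", "large"): 120, ("Manufacturing", "small"): 300,
               ("NHS Trusts", "large"): 60}

# Each respondent in a stratum stands for population/respondents organisations.
weights = {stratum: population[stratum] / respondents[stratum] for stratum in respondents}

def weighted_proportion(stratum_yes):
    # Weighted share of organisations reporting a given risk reduction measure.
    yes = sum(weights[s] * n for s, n in stratum_yes.items())
    total = sum(weights[s] * respondents[s] for s in respondents)
    return yes / total

print(weighted_proportion({("Manufacturing", "large"): 80,
                           ("Manufacturing", "small"): 90,
                           ("NHS Trusts", "large"): 55}))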

The Task
All the examples listed in the questionnaire relating to reducing the risk by altering the task
were used. Except for reducing pushing and pulling efforts, each of the approaches had been
employed by 40–60% of the organisations. Reduction of lifting from floor and above
shoulder level appeared to be the most popular measure along with reducing carrying
distances, while reductions in pushing and pulling efforts were only implemented by 27% of
the organisations. This possibly reflects the relative ease with which such measures could be
introduced.
NHS Trusts had a high percentage who reduced the amount of twisting and stooping and
lifting from floor level. Reducing lifting from floor level or above shoulder level was lowest
in the Transport sector. Although reducing pushing and pulling was generally low, this
example was mentioned by a high percentage in the NHS Trusts and Finance sectors.
The relatively low score for reducing pushing and pulling under the task category may be
because respondents do not readily identify measures to reduce this particular risk, such as
improved trolley maintenance procedures. However, other measures under the environment
category, such as improving floor conditions and the layout of the workplace, can also have
the same desirable effect. Similarly, reducing the load, under the load category, will also
result in lower pushing and pulling efforts. Clearly, the categories are interrelated and can
influence each other.
Varying the work was the least frequently adopted risk reduction method mentioned by
the Fire Brigade. Varying the work was more prevalent in larger organisations, presumably
because they would have the flexibility to do this. Generally all examples were stated more
often by larger companies.

The Load
Relatively few organisations had addressed elements under the ‘load’ category to reduce
manual handling risks, compared with changing the way an individual undertook the job (ie.
the task). Making a load easier to handle (16%), more stable (18%) and safer to handle (17%)
were reportedly tried less often than the 40–60% achieved under the task category. The
exceptions were reducing the size or weight of load (35%) and providing employees with
more information about the loads, which was the most commonly stated means of reducing
the risk attributable to the load itself (57%). This approach was generally high in all sectors,
with Agriculture the lowest.
The widespread use of the provision of information as a risk-reduction technique is
possibly because this is also a specific requirement under Regulation 4(1)(b)(iii). Providing
load information would also be expected to be high as it is a non-physical change, and in most
instances would be an easier and cheaper option to implement in order to reduce the manual
handling risks associated with the load. These findings could be interpreted as suggesting that
organisations had less scope for modifying characteristics of the load and provide further
justification for the emphasis on factors other than the load, such as who, where and how the
load is handled.
The next most commonly stated example was reducing the weight or size of the load,
which was highest in NHS Trusts and Local Government, and lowest in Wholesale/Retail and
Services. There were no strong relationships between size of organisation and the methods
used, except that larger organisations tended to give employees more information compared
with smaller organisations.

The Working Environment


Under the environment risk reduction category, clearing obstructions was introduced by
almost half (48%) of the organisations, and improving the layout of workplaces and improving
the lighting by over a third (36% and 34% respectively). In most cases these risk reduction
measures could be introduced simply by improving the standard of housekeeping, allowing
handlers more room and clearer access along handling routes so that they could adopt more
acceptable postures. As a consequence, manual handling risks in the working environment
could be reduced or eliminated relatively easily and cheaply by adopting good housekeeping
measures. Fewer than one quarter of respondents mentioned levelling uneven floors, reducing
risks associated with steps and ramps, and controlling temperature and draughts.
NHS Trusts had the highest percentage who improved the layout and cleared obstructions.
Construction, Manufacturing, Finance and Local Government had high prevalence of
clearing obstructions as a means of reducing risks. There were clear trends in the data for size
of organisation. Almost all means of reducing risk by altering the environment were more
frequently used in the larger organisations.

The Individual
Risk reduction measures on ‘selecting the individual’ and ‘providing training’ (together with
‘other factors’) were implemented by between a third and almost two thirds of the
organisations. Providing training in handling techniques was the most common, which was
virtually 100% in the sectors of Local Government, NHS Trusts and the Fire Brigades.
Considering the relatively high number who identified this as a requirement of the
Regulations, this result is not surprising. Along with other information and training methods,
this provision in most cases is the easiest to implement and therefore would tend to appeal as
a quick and easy risk reduction measure. While it is recognised that training and information
have an important role to play, it is also widely acknowledged that they will not be particularly
effective in reducing manual handling problems if the design of the workplace, the loads
handled and what operators are asked to do still result in awkward handling postures.
A third of the organisations employed selection procedures to identify individuals not
physically suited to manual handling. It would be interesting to analyse what selection
criteria organisations employed, as the most likely factors such as strength, age and gender
have been shown in the scientific literature not to correlate particularly well with reducing
manual handling risks. On the other hand, some factors such as previous back problems and
long absences from work due to holidays and illnesses have shown a positive relationship.
However, this level of detail of information was not collected during the survey.
There was a strong trend related to company size, with larger organisations more likely to
provide training in good handling techniques. However, this may be due to the fact that many
of the larger organisations such as Local Governments, NHS Trusts and Fire Brigades have
large numbers of staff employed in jobs where handling training has traditionally been an
integral part of general training.

Site Visits
The outcome of the site visits showed that, as in the postal survey results, organisations were
adopting a wide range of ergonomic measures to reduce manual handling risks. The extent of
risk reduction measures was influenced by non-manual handling issues such as new
technology, process efficiency and general investment. The manual handling assessments
were lacking in some cases, in terms of both job coverage and the detail they contained.
There were also opportunities to do more to reduce manual handling risks by using the
generic and collective approaches advocated in the Regulations.

Conclusions
The results of the postal survey, with an overall expected response rate of 30%, showed that
the range of measures reported for reducing manual handling risks was encouraging and that
the ergonomic approach advocated was being taken on board. The most common risk
reduction methods employed by organisations were: reducing lifting from floor level or from
above shoulder level under the ‘task’; providing more load information under the ‘load’;
improving workplace layout under the ‘environment’ and providing manual handling training
techniques under the ‘individual and training’ section.
While the site visits confirmed a wide range of risk reduction steps, the practical
implementation of some of these measures was not always effective. The coverage and
quality of the assessments and hence the risk reduction strategies adopted were also a
concern. What appeared less obvious were the arrangements in place to ensure that the
monitoring, auditing and co-ordination of these risk reduction methods was being carried out
to ensure that measures were being effectively implemented.

Acknowledgements
The author would like to thank the HSE for funding this research and to co-workers at the
IOM who contributed to the main research project from which this paper is taken.

References
Council of the European Communities (1990). Council Directive of 29 May 1990 on the
minimum health and safety requirements for the manual handling of loads where
there is risk particularly of back injuries to workers. (Fourth individual Directive
within the meaning of Article 16(1) of Directive 89/391/EEC). Official Journal No.
L156/9–13 (90/269/EEC).
Health and Safety Commission (1982). Consultative Document: Proposals for Health and
Safety (Manual Handling of Loads) Regulations and Guidance Notes. London:
HMSO.
Health and Safety Executive (1992). Manual Handling. Manual Handling Operations
Regulations 1992. Guidance on Regulations L23. London: HMSO.
Tesh KM, Symes AM, Graveling RA, Hutchison PA, Wetherill GZ (1992). Usability of
manual handling guidance. Edinburgh: Institute of Occupational Medicine (IOM
Report TM/92/11).
Tesh KM, Lancaster RJ, Hanson MA, Ritchie PJ, Donnan PT, Graveling RA (1997).
Evaluation of the Manual Handling Operations Regulations 1992 and Guidance HSE
Contract Research Report No. 152/1997 Sudbury: HSE Books.
CONTROL OF MANUAL HANDLING RISKS WITHIN A
SOFT DRINKS DISTRIBUTION CENTRE

Liz Wright1 and R A Haslam

Health and Safety Ergonomics Unit


Department of Human Sciences
Loughborough University
Leicestershire
LE11 3TU

This paper describes an investigation into the presence of manual handling


risks, and measures put into place to control these risks, within a large soft
drinks distribution centre. Company risk assessments had identified risk
associated with handling activities and described training as the control.
Postures were analysed using OWAS, and the NIOSH equation was used to
estimate levels of risk. The manual handling training programme was
evaluated by comparing content with recommended criteria taken from the
literature. Manual handling risks were found in both warehouse and delivery
areas, some being classed as “excessive” using the NIOSH calculation. The
study recommended other means of addressing manual handling risks.

Introduction
Injuries, particularly to the back, resulting from handling activities produce a significant
proportion of reported injuries (HSE, 1992). There is a high reported rate of back injury
among warehouse workers (Ljungberg et al, 1989) with increasing levels of back injury
within the drinks industry; professional drivers have also been shown to have a relatively high
prevalence of musculoskeletal injury (van der Beek et al, 1993). Manual materials handling
(MMH) tasks can be evaluated using various well documented approaches: biomechanical,
psychophysical, epidemiological and physiological (Mital et al, 1997). These approaches
have helped determine the primary risk factors involved with manual handling and resultant
musculoskeletal injuries. This study reviewed the presence of manual handling risks to
warehouse operatives and delivery drivers within a soft drinks distribution centre, and
evaluated control measures. Risk assessments had been performed by the organisation in
response to the Management of Health and Safety at Work Regulations 1992, and had
identified manual handling as a problem.

1 Now at Human Applications, 139 Ashby Road, Loughborough, Leicestershire, LE11 3AD

Training was specified as the control measure, and the company specifically requested an
evaluation of their training programme. The company had also attempted to reduce the effects
of MMH by other means such as raising some of the pallets to a more acceptable height, but
had not assessed the effects of these changes.
There have been numerous case studies investigating MMH in a variety of work situations
(such as Burdorf and Vernhout, 1997; Hickson and Megaw, 1994; Vessali and Kothiyal, 1997)
in most cases using a combination of two or more methods. An interview based questionnaire
was used to obtain worker information on the presence of musculoskeletal disorders and
opinions on training. Postures of both groups were analysed using the Ovako Working
Posture Analysing System, and levels of risks in different situations were estimated using the
NIOSH equation and HSE guidelines.

Methods
The study included two groups of employees; warehouse operatives and local delivery service
(LDS) drivers. Within the warehouse is a “break bulk” area, where cases of soft drinks are loaded
onto LDS lorries. Operatives drive powered pallet trucks to locations in the break bulk area, and
select, or “pick”, cases either onto pallets or into cages, which they then take to the loading area.
Lorries are loaded with either cages or pallets of product for delivery to small and medium retail
outlets and drivers manually unload cases at the customer location. The study therefore involved
two work methods in the warehouse and delivery, as well as two groups of workers.

Semi-structured interviews
A questionnaire was used as a framework for semi-structured interviews conducted with 19
warehouse operatives and 12 drivers, selected during different shifts, over a two week period.
This included questions on training attendance and opinions, work experience and job
satisfaction, as well as a section derived from the Nordic questionnaires (Kuorinka et al,
1987) on musculoskeletal disorders.

Ovako Working Posture Analysing System (OWAS)


OWAS (Louhevaara and Suurnakki, 1992) postural analysis is a method devised to classify working
postures of the back, arms and legs, giving an estimation of the potential for musculoskeletal
injury. The position of body components and activity being performed are coded and recorded at
timed intervals. Initial observations of workers identified the specific activities. A sample period
of 20 seconds was used following pilot trials. Warehouse operatives were observed from the start
of picking onto a new cage or pallet until completion. Drivers were observed from the start of
unloading lorries to the point where they begin to wheel product onto premises.
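As an illustration of how such timed observations might be recorded and tallied, a simplified Python sketch is given below; the posture-code-to-action-category entries shown are illustrative assumptions only, and the full mapping comes from the published OWAS tables.

from collections import Counter

# Illustrative entries only: OWAS codes back (1-4), arms (1-3), legs (1-7) and load (1-3),
# and each combination maps to an action category (1-4) via the published OWAS tables.
ACTION_CATEGORY = {
    (1, 1, 2, 1): 1,   # back straight, arms below shoulders, standing on two legs, light load
    (2, 1, 2, 1): 2,   # back bent, otherwise as above
    (4, 1, 4, 2): 4,   # back bent and twisted, knees bent, moderate load
}   # ...remaining combinations should be taken from the published table

observations = [(1, 1, 2, 1), (2, 1, 2, 1), (1, 1, 2, 1), (4, 1, 4, 2)]  # one code per 20 s interval (hypothetical)

tally = Counter(ACTION_CATEGORY.get(code, 0) for code in observations)   # 0 = not in this partial table
for category in sorted(tally):
    share = 100 * tally[category] / len(observations)
    print(f"Action category {category}: {share:.0f}% of observations")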

National Institute of Occupational Safety and Health (NIOSH) Equation


The revised NIOSH equation (Waters et al, 1993) was used to determine the level of risk associated
with some MMH tasks. The calculation considers factors including horizontal and vertical distances,
frequency of lifting, asymmetry and coupling (grip) to determine the Recommended Weight Limit
(RWL). A “load constant” of 23 kg is used as a maximum, which is then altered by multipliers
according to the specific lifting conditions. The Lifting Index (LI) compares the actual weight
being handled with RWL, providing an estimated level of risk, with probability of low back pain
increasing as the LI increases (Waters et al, 1997). The developers of the NIOSH equation agree
that “many workers will be at an elevated risk if the LI exceeds 3.0” (Waters et al, 1993).
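For illustration, the calculation can be sketched as follows (Python). The horizontal, vertical, distance and asymmetry multipliers follow the published metric formulas; the frequency and coupling multipliers come from tables and are passed in directly, and all input values shown are hypothetical rather than taken from this study.

LOAD_CONSTANT = 23.0   # kg

def recommended_weight_limit(h_cm, v_cm, d_cm, asym_deg, fm, cm):
    # RWL = LC x HM x VM x DM x AM x FM x CM (metric form of the revised equation).
    hm = min(1.0, 25.0 / h_cm)                   # horizontal multiplier
    vm = max(0.0, 1 - 0.003 * abs(v_cm - 75))    # vertical multiplier
    dm = min(1.0, 0.82 + 4.5 / d_cm)             # distance multiplier
    am = max(0.0, 1 - 0.0032 * asym_deg)         # asymmetry multiplier
    return LOAD_CONSTANT * hm * vm * dm * am * fm * cm

def lifting_index(load_kg, rwl):
    return load_kg / rwl

# e.g. a case lifted from the back of a pallet (all values illustrative)
rwl = recommended_weight_limit(h_cm=55, v_cm=20, d_cm=80, asym_deg=45, fm=0.85, cm=0.95)
print(f"RWL = {rwl:.1f} kg, LI = {lifting_index(12.0, rwl):.1f}")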

Training evaluation
A list of topics to be included in training was developed from literature (see table 1) and was
used to evaluate training. Training is provided by a number of workers all of whom have
completed a five day course provided by a consultancy. Training sessions and information
given to those who attend is based on that provided by the consultants.

Table 1 List of criteria used to assess the manual handling training


(Based on Birnbaum et al, 1993; Chaffin and Andersson, 1991; Chaffin et al, 1986;
Chartered Society of Physiotherapy (CSP), 1994; HSE, 1992b; Kroemer, 1992; Troup and
Edwards, 1985)

Results
Semi-structured interviews
The most frequently reported musculoskeletal disorder over the last year was that of lower
back problems, reported by 47% of warehouse operatives (n=19) and 58% of drivers (n=12).
Knee problems were reported by 50% of drivers and 32% of warehouse operatives. 50% of
drivers also reported neck trouble. Of those reporting back trouble, 22% (7 workers) report
having been absent from work because of it during the last year, 16% report changing duties
(for example carrying out light duties for a period of time) and half claim to have reduced
their activities at home or at work.

Ovako Working Posture Analysing System (OWAS)


The OWAS categories were used to study combinations of postures and individual back
postures. Differences in postural combinations and in individual back postures were analysed
using the chi-square test on observations in each group, with the null hypothesis “there is no
difference between cages or pallets with respect to proportions of harmful postures observed”
(in either warehouse or delivery). Harmful postures were looked at relative to the activities in
which they occurred. A minimum of 100 observations is recommended for each job or task
(Louhevaara and Suurnakki, 1992).
A total of 531 observations were made of operatives picking. There were significant
differences (p<0.01) between using cages and pallets, between both postural combinations
and between individual back postures, with fewer harmful postures using cages. A total of
603 observations were recorded on delivery. There was a significant difference between
postural combinations (p<0.05) in favour of cages but the difference was not significant when
comparing individual back postures.
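A minimal sketch of this comparison is given below (Python); the contingency-table counts are hypothetical and are not the study's observation counts.

from scipy.stats import chi2_contingency

#            harmful  not harmful
observed = [[40, 230],    # picking onto cages (hypothetical counts)
            [85, 176]]    # picking onto pallets (hypothetical counts)

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.1f}, dof = {dof}, p = {p:.4f}")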

When using pallets, the activities which had the highest proportions of harmful postures were
lifting and lowering, reaching for products, and climbing on and off the lorry.

National Institute of Occupational Safety and Health (NIOSH) Equation


Calculations were made to provide estimates of the risks associated with various other
conditions, including different worker techniques and the effects of raised areas. An LI of less
than 1 was possible in some areas when lifting close to the body with no twisting. Increasing
the horizontal distance when lifting from or to the back of a pallet or cage significantly
reduced the RWL (LI of 3.1 and 2.4 respectively). The reduction was slightly less when using
cages due to their smaller depth. Worker posture reduced the RWL when twisting (from 11.82 kg
to 10.17 kg) or when not keeping close to the load. When lifting crates, workers cannot slide
the load close to them; lifting crates from the back to the front of a pallet resulted in an LI of
over 4.

Training
Comparison of the training session content with the criteria is shown in table 2. A large number
of interviewees described the session as “interesting”, “good”, “informative” and “thorough”,
increasing their awareness of lifting and how the body works. The training concentrated on a
particular technique without discussing realistic approaches to actual working conditions.

Table 2. Results of comparative evaluation

Coverage: “Good” if area covered fully, “Satisfactory” if some aspects are covered,
“Poor” if this area not covered.

Discussion
The organisation had performed risk assessments as part of the requirements under the
Management of Health and Safety at Work Regulations (1992) and these identified manual
handling as a high risk activity. More detailed manual handling risk assessments had not
taken place.

There had been attempts to control these handling risks, but this study concluded that such
risks were still present. Training was identified by the organisation as the primary control
measure. The study emphasised the need for training to be used as a secondary control
measure and recommended other control methods including the introduction of equipment
and changed work practices. Recommendations were also made regarding the training itself,
primarily to ensure that it was relevant for specific work conditions.
The use within the study of more than one methodology was an important means of
identifying some of the complex factors involved in MMH. The OWAS and NIOSH methods
in some instances resulted in conflicting recommendations. Consideration of workplace
factors as well as actual worker postures, along with subjective information, allowed realistic
recommendations of means of reducing manual handling risks to be delivered.

References
Burdorf, A and Vernhout, R, 1997, Reduced physical load during manual lifting activities
after introduction of mechanical handling aids. In Seppala, Luopajarvi, Nygard and
Mattila (ed.) Proceedings of 13th Triennial Congress of the International Ergonomics
Association, June 29th-July 4th, Tampere, Finland. Vol 4 Musculoskeletal Disorders
and Rehabilitation.
Health and Safety Executive, 1992, Manual handling: guidance on regulations (HMSO,
London)
Hickson, J and Megaw, T, 1994, An example of job redesign in a large automotive plant,
Contemporary Ergonomics 1994, (Taylor and Francis, London) 400–5.
Kuorinka, I, Jonsson, B, Kilbom, A, Vinterberg, H, Biering-Sorensen, F, Andersson, G,
Jorgensen, K, 1987, Standardised Nordic Questionnaires for the analysis of
musculoskeletal symptoms, Applied Ergonomics, 18, 3, 233–237
Ljungberg, A.S, Kilbom, A and Hagg, G.M, 1989, Occupational lifting by nursing aides and
warehouse workers, Ergonomics, 32, 1, 59–78
Louhevaara, V and Suurnakki, T, 1992, OWAS: a method for the evaluation of postural load
during work, Training Publication II. (Institute for Occupational Health. Helsinki,
Finland)
Mital, M, Nicholson, A.S and Ayoub, M.M, 1997, A guide to manual materials handling
(Taylor and Francis, London)
Van der Beek, A.J, Bruijns, P.W, Veenstra, M.S and Frings-Dresen, M.H.W, 1993, Energetic
and postural workload of lorry drivers during manual loading and unloading of goods.
In Marras WS, Karwowski W, Smith JL and Pacholski L (eds.) The Ergonomics of
Manual Work
Vessali, F and Kothiyal, K, 1997, A case study of the application of revised NIOSH guide
and OWAS to manual handling in a food processing factory. In Adams (ed.)
Proceedings of the 30th Annual Conference of the Ergonomics Society of Australia.
Sydney, 1994
Waters, T.R, Putz-Anderson V, Garg A and Fine L, 1993, Revised NIOSH equation for
the design and evaluation of manual lifting tasks, Ergonomics, 36, 749–776
Waters, T, Baron, S, Haring-Sweeney, M, Piacitelli L, Putz-Anderson, D, Skov, T and Fine,
L, 1997, Evaluation of the revised NIOSH lifting equation: a cross sectional
epidemiological study. In Seppala, Luopajarvi, Nygard and Mattila (eds.)
Proceedings of 13th Triennial Congress of the International Ergonomics Association,
June 29th-July 4th, Tampere, Finland. Vol 4 Musculoskeletal Disorders and
Rehabilitation. Finnish Institute of Occupational Health.
TRAINING AND PATIENT-HANDLING: AN
INVESTIGATION OF TRANSFER

J A Nicholls and M A Life

Ergonomics and HCI Unit


University College London
26 Bedford Way
London WC1H OAP

This paper describes an exploratory study comparing novices’ compliance


with patient-handling training in the taught classroom setting with their later
compliance in the workplace. Their workplace performance was also
compared to that of experts. Reasons for non-compliance were explored.
Novice workplace performance was significantly worse than classroom
performances and worse than the workplace performance of experts. The
differences were, in part, attributable to the failure of the practical element of
the programme to support workplace handling. Experts were more likely to:
decompose the task into subunits; think in advance; respond to unexpected
events. It is suggested that training needs to provide more support to
novices’ acquisition of practical knowledge. Such training would enable
development of a set of advance planning behaviours to enhance
performance in diverse clinical situations.

Introduction
Little consensus exists regarding the effectiveness of training in patient-handling (Wood,
1987; Videman et al 1989). Much training design has viewed handling as a variant of simple
lifting. Hence, training has been based on a physical model of the trainee, relying on the
assumption that the acquisition of a set of lifting skills will produce safe handling in the
clinical workplace. Most studies of training effectiveness have examined performance in
either the classroom (e.g. Troup and Rauhala, 1987) or workplace (Takala and Kukkonen,
1987), but few have investigated both the acquisition of handling skills and their subsequent
application. Furthermore, handling patients is a complex task which requires both overt
behaviours and associated mental behaviours. The presumption that ineffective training may
be attributable to trainees’ lack of ‘physical’ skill has led to scant attention being paid to the
required mental behaviours such as planning and decision making during a lift. The general
aim of this study was to explore the existence and nature of the problem of ineffective
training. Specific aims were: 1) to determine whether novice performance differs in the
classroom and workplace; 2) to compare novice and expert workplace performance and
behaviours; and 3) to consider the adequacy of the design of the training programme.

Methods and materials


The study was designed to explore the problem in as naturalistic a manner as possible. The use
of a novice group of subjects in both the classroom and workplace, and of the expert group in
the workplace, provided some level of control over the variable of lifters’ knowledge. A between
groups comparison of novices’ and experts’ workplace performances was possible and a within
groups comparison of novices’ classroom and workplace performances.
To assess performance, a checklist was developed based on that previously used by
Chaffin et al (1986). Raters were asked to assess subjects’ (Ss) performance with respect to 5
criteria concerning: closeness of load, erectness, smoothness, avoidance of twisting and
adequacy of grip. In health workplaces, optimum body postures often cannot be assumed
because of environmental constraints so the criteria in the checklist were all defined in a
relative sense using the qualifier ‘as possible’.
A 16-item questionnaire was developed to facilitate exploration of the relative adequacy
of the lecture and practical parts of the programme, and to provide a basis for understanding
any differences in classroom and workplace performance.

Procedure
Eleven undergraduate physiotherapy students and eleven expert therapists participated in the
study. Each expert had more than 3 years, full time, post-qualification clinical experience.
Novices were videoed whilst performing a handling task on a simulated patient in the
classroom following completion of training. The training was based on the teaching of basic
principles reflected in the ‘straight back, bent knees’ profile. The class setting permitted
application of the principles of handling as taught. As part of the assessment Ss were
questioned to test their lecture knowledge and were scored using a 1–5 scale with a score of 3
representing an acceptable level of knowledge. The videorecordings were subsequently
assessed by two independent raters using the performance checklist. Each rater had more than
10 years experience of making observational assessments of students’ performance. Each
criterion was scored on a 0–3 scale, so performance scores between 0 and a maximum of 15
could be achieved. To pass the classroom assessment Ss had to score at least 60% on the
practical part of the assessment, i.e. a checklist score of 9 or more.
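The scoring rule can be sketched as follows (Python); the example ratings are hypothetical.

CRITERIA = ["closeness of load", "erectness", "smoothness",
            "avoidance of twisting", "adequacy of grip"]

def checklist_score(ratings):
    # Sum of five 0-3 ratings (maximum 15); pass mark is 60%, i.e. 9 or more.
    assert len(ratings) == len(CRITERIA) and all(0 <= r <= 3 for r in ratings)
    score = sum(ratings)
    return score, score >= 9

print(checklist_score([3, 2, 2, 3, 2]))   # -> (12, True)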
Ss were videoed again two years later, whilst performing two handling tasks during the course
of their normal working routines, on a working day chosen at random. The observer followed Ss
as unobtrusively as possible. To try to ensure as rich a capture of data as possible, the observer
also noted the behaviours carried out by a subject in association with the task performed (e.g.
clearing the workspace). Following completion of the handling task, Ss were interviewed.
Assessment of Ss behaviours was made from the videos and from the observer’s notes.

Data Analysis
Each video was assessed by one of two independent raters on two separate occasions. The raters
were trained by exposing them to videos of ‘good’ and ‘poor’ performance in order to establish
a consistent set of criteria. Each rater judged each subject’s performance, in the classroom and
the workplace, using the performance checklist. The behaviours were informally classified from
the videorecordings and experimenter’s notes on an arbitrary temporal basis into one of three
stages: the preparatory stage (i.e. activities prior to actual change of a patient’s position), the
lifting stage (i.e. activities concerned with changing a patient’s position), and the post-lifting
stage (i.e. activities subsequent to a patient being moved).

Results
It was considered that the data met the requirements for the use of parametric analyses (Huck
and Cormier, 1996). Data were analysed for a total of 33 tasks representing data analysis of 1
task performed by each expert and novice in the workplace and 1 task performed by each
novice in the classroom.

Performance scores
Reliability of checklist: Intra-class correlations were used to estimate the inter and intra-rater
reliability of the checklist. All of the scores were significantly associated (r=0.92). Classroom
performance scores of novices: The mean performance score achieved by the novices was
11.91 (sd=1.37). Subjects’ lecture knowledge was also found in all cases to be of an acceptable
standard (mean=3.45, sd=0.52). Workplace performance scores of novices and experts: The
mean performance score achieved by the novices was 8 (sd=3.16). To test the hypothesis that
there would be a significant difference between novices’ class and workplace scores, a Student’s
t-test was used. This revealed a significant difference (t=4.87, df=10, p=0.0007) indicating a
decline in performance in the workplace. The mean performance score achieved by the experts
in the workplace was 12.91 (sd=0.87). A two sample t-test revealed a significant difference
between novices’ and experts’ workplace scores (t=3.4, df=20, p=0.004), suggesting that
novices’ performance might be amenable to improvement.
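The two comparisons can be sketched as follows (Python); the score lists are hypothetical and chosen only to illustrate the paired and two-sample tests.

from scipy.stats import ttest_rel, ttest_ind

novice_class     = [12, 11, 13, 12, 10, 14, 11, 12, 13, 11, 12]   # hypothetical checklist scores
novice_workplace = [ 9,  7, 11,  8,  4, 12,  6,  8, 10,  5,  8]   # hypothetical
expert_workplace = [13, 12, 14, 13, 12, 13, 14, 12, 13, 12, 14]   # hypothetical

t1, p1 = ttest_rel(novice_class, novice_workplace)      # within-subject: classroom vs workplace
t2, p2 = ttest_ind(expert_workplace, novice_workplace)  # between groups: experts vs novices
print(f"Classroom vs workplace: t = {t1:.2f}, p = {p1:.4f}")
print(f"Experts vs novices:     t = {t2:.2f}, p = {p2:.4f}")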

Workplace behaviours
In order to explain the performance differences, subject behaviour was examined in detail.
Many behaviours were common to novices and experts. For example, Ss in both groups
communicated with the patient and assessed the clinical situation. In the preparatory phase,
experts seemed more likely than novices to: decompose the task into sub-units; clear the
workspace of relevant objects (i.e. those likely to interfere with performance of a task); adjust
the bed; arrange pillows on the bed; position the patient’s feet; and check that the patient was
free of encumbrances to movement. During the lifting phase, experts appeared to be more likely
to: move the patient to the edge of the bed/chair; continue instructing the patient during the
manoeuvre; respond to unexpected events. Conversely, novices appeared less likely to: break
the task down into sub-units; and clear the workspace appropriately, e.g. novices tended to
reposition objects, the position of which was irrelevant to the execution of a lift. Novices also
seemed to be less able to respond to unexpected events. There appeared to be fewer differences
in the behaviours performed by novices and experts in the actual lifting phase, although as the
performance scores showed, the quality of execution of the behaviours differed markedly.

Reasons for novices’ non-compliance in the workplace


No problems were identified regarding the lecture knowledge components of the programme.
The main problems concerned the practical aspects of the programme; nine novices identified
insufficient practical time, and eight reported difficulty in remembering the practical parts.
Just under half of the subjects reported having a poor understanding of the practical parts.
Most Ss claimed that their ability to handle had improved since their classroom assessment
and almost all claimed to have been motivated when being assessed in the classroom.

Discussion
This exploratory study aimed to expose the existence of a problem concerning training in
handling by examining the performance and behaviour of novices and experts. Overall, the
results indicate marked differences between novices’ performances in the classroom
compared to the workplace, and between novices’ performances in the workplace and those
of experts. Differences were also found between the behaviours of novices and experts,
confirming the existence of a problem.

Performance scores
In general terms, adequate performance in the class followed by failure to perform adequately
in the workplace suggests a problem with transfer of the knowledge acquired during training.
All the novices achieved reasonably high classroom performance scores indicating that, in this
study, as in Videman et al’s (1989) investigation, the taught techniques could, in fact, be learnt.
The observed level of classroom performance is, however, markedly higher than that found by
Troup and Rauhala (1987) and Videman et al (1989) when using a single three-point scale.
Although Videman et al (1989) found a significant difference between the performances of
trained and untrained subjects, the mean score of the trained group failed to reach 50% of the
maximum. It could be argued that, in the present study, each of the criteria upon which the
performance checklist is based are related, leading to a subject’s performance being overassessed.
However, the size of the comparative difference makes it seem unlikely that it can solely be
attributed to one of possible overassessment. It seems more likely that Videman et al’s (1989)
low score suggests either unrepresentatively poor subject groups or possibly an inappropriate
scoring technique. Whereas our assessment of performance was derived from five relatively
circumscribed subscores Videman et al (1989) used a single measure of performance which
incorporated assessment of less easily defined aspects of performance, (e.g. how well a lift was
planned). Experts’ workplace scores, whilst still ‘imperfect’, were significantly better than the
scores of novices. Methodological issues aside, the main implication of the difference is that the
deficiency in novice performance might be amenable to being reduced. To explore this further,
it was necessary to consider the different ways in which novices and experts behave.

Workplace behaviours and reasons for novices’ non-compliance


Although as standardised as possible, the recording and classification of behaviours was
subjective and this must be borne in mind when considering the behavioural data.
Nevertheless, in comparison to novices, experts’ behaviour suggested two characteristics:
experts appeared to organise the task across a longer time horizon and they seemed to be
more likely to respond to unexpected events. Experts evidenced behaviours which seemed to
indicate that they had thought through their intended plan of action, e.g. in the case of a
catheterised patient transferring from bed to chair, experts were more likely to check that the
catheter was free to move. More detailed forward planning was also suggested by the fact that
the experts were more likely to explain the task to the patient by breaking it down into
subsections. In contrast, novices were, for example, less likely to clear the destination area
suggesting that they had not thought through the intended plan of action in the same way as
experts. This was supported by the fact that, unlike the experts, novices tended to explain the
task to the patient in terms of the overall aim rather than by breaking the task down into
subsections. Novices’ apparent inability to break the task down into sub-sections in advance
suggests a bottom-up strategy of problem-solving with little advance planning of the
consequences of their current actions. Such expert-novice differences have been observed in
other domains (e.g. Chi et al, 1983). Novices also seemed less able to respond to unexpected
events, resulting in their frequently being found in situations where, for example, the chair
was too far from the bed. Experts seemed less likely to be ‘stranded’ by a poorly positioned
chair and, if this did happen, they were more likely to be able to respond. Either the experts
were more able to predict likely problems and so prevent them, or else their strategies for
action were sufficiently flexible to allow them to respond to the unexpected.
Reflecting back to the training programme, novices may be deficient in the high level
knowledge required to plan the sequencing of their movements, and to adjust this planning in
response to the changing situational demands. Novices’ questionnaire responses suggested that
poor workplace performance concerned those components of the programme associated with
the provision of practical knowledge. The suggestion is that it is these elements which need
redesigning or extending. The most clear-cut finding was that the amount of practice was
insufficient, leading to the practical knowledge being inadequately acquired and so failing to
support handling in the workplace. It may not be that the programme fails to promote the
acquisition of practical knowledge per se, but that it fails to foster its acquisition at a sufficiently
high level to support the development of a flexible set of behaviours which can be recruited in
a novel situation. This seems to leave novices with little option other than to perform the task as
best they can, in other words, to rely on the low level knowledge that they have acquired.
In conclusion, this study has demonstrated a problem concerning novices’ use of taught
handling skills in the workplace, and has identified weaknesses in the training. We suggest
that it is reasonable to attempt to reduce the problem by enhancing practical training so that
novices’ ability to consider their behaviours in advance is supported.

References
Chaffin, D.B., Gallay, L.S., Wooley, C.B. and Kuciemba, S.R. 1986, An evaluation of the
effect of a training program on worker lifting postures. International Journal of
Industrial Ergonomics 1 127–36
Chi, M.T., Glaser, R. and Rees, E. 1983, Expertise in problem solving. In Sternberg, R.J.
(Ed) Advances in the psychology of human intelligence, (Vol 2. Lawrence Erlbaum.
Hillsdale, New Jersey).
Huck, S.W. and Cormier, W.H. 1996, Reading statistics and research. Harper Collins. NY.
St Vincent, M. and Tellier, C. 1989, Training and handling: an evaluative study. Ergonomics
32 191–210
Takala, E.P. and Kukkonen, R. 1987, The handling of patients on geriatric wards. Applied
Ergonomics 18 17–22
Troup, J.D.G., and Rauhala, H. 1987, Ergonomics and training. International Journal of
Nursing Studies 24 325–30.
Videman, T., Rauhala, S., Lindstrom, K., Cedercruetz, G., Kamppi, S., Tola, S. and Troup, J.
1989, Patient handling skill, back injuries and back pain. An intervention study in
nursing. Spine 14 148–56
Wood, D.J. 1987, Design and evaluation of a back injury prevention programme within a
geriatric hospital. Spine 12 77–82
Risk Management in Manual Handling
for Community Nurses

Pat Alexander

Back Care Adviser


Herts Handling Training and Back Care Advice
36 Barlings Road,
Harpenden, Herts, AL5 2BJ

A mixed method study was set up to evaluate a risk management programme


using a quantitative survey of both managers and operational staff, and to
explore risk-taking behaviour in community nurses with reference to manual
handling practices. Nurses obliged to work in situations where managers’
recommendations had not been implemented were 3 times more likely to
have taken sick-leave for back/neck pain in the last 12 months. Different
perceptions of risk by managers and operational staff were revealed, which
could be addressed by joint training or consultation.

1 Introduction
In the past physiotherapists have often trained care staff in the moving and handling of
patients. In 1992 the author was asked to set up a training department for a Community Trust,
which would also train staff from outside agencies as an income generating activity.
In view of the emphasis of the forthcoming European initiative on manual handling
leading to the Manual Handling Operations Regulations 1992 (MHOR 1992), a strategy was
devised that trained all Community Nurse Managers in Risk Assessment and Reduction, and
provided a manual handling programme for all hands-on community nurses. A form for
recording risks and recommendations for manual handling was compiled in 1993 and a
programme implemented to raise awareness of risk, and educate staff in good practice. It was
expected that financial savings would result from a decrease in litigation and sick leave for
musculo-skeletal problems. However, the Trust method of recording sick leave did not allow
access to these data, and it is well known that there is under-reporting of accidents in the NHS
(National Audit Office, 1996). Thus it was decided that a survey of self-reported back/neck
pain would be necessary to establish a baseline to evaluate the effect of the programme.
It was anticipated that due to the Care in the Community Act (1990) the level of
dependency of patients nursed in the community would increase. In order to reduce the effect
of confounding variables such as an increased input from Social Services reducing the
amount of hands-on nursing required from Trust staff, the survey included questions on
changes seen to have influenced community nursing.

2 Methods
In order to explore the hypothesis, a quasi-experimental survey was conducted, followed by
semi-structured interviews. The measurements taken were the amount of self-reported sick leave
taken for back/neck pain by the hands-on nurses, and the managers' ability to implement their
own recommendations for risk reduction in manual handling strategies.
As Trust sickness data and accident reports were considered either inaccessible or
unreliable, two anonymous postal surveys were conducted among 61 Community Nurse
Managers and 165 hands-on community nurses. The two questionnaires were based on the
Witney Back Survey (Harvey, 1985), with additional questions on risk assessment and related
themes. A response rate of 69% was obtained from the managers and 55% from the hands-on
staff. The results of the two questionnaires were analysed, and interviews were conducted with
volunteer subjects from management and hands-on staff from a range of geographical areas in
the Trust.

3 Results
The managers’ questionnaire showed the following factors were believed to prevent
implementation of their recommendations for safer practice.

Figure 1. Factors preventing implementation of managers' recommendations

The hands-on staff showed a different perception of the factors influencing their back/
neck pain.
Table 1. Factors perceived by hands-on community nurses to influence back/neck pain

Lifting heavy patients        40.9% (n=38)
Stooping                      68.8% (n=64)
Space constraints             61.2% (n=57)
Lifting boxes, etc.           19.3% (n=18)
Driving                       30.1% (n=28)
Problems in patient's home    44.0% (n=41)

The data relating to the epidemiology of back/neck pain were only collected from the
hands-on nurses. The lifetime prevalence of back/neck pain was found to be 83% (n=78), the
point prevalence 32.2% and the annual prevalence 72%. There was a significant relationship
between the number of hours worked and the annual prevalence of back/neck pain
(chi-squared test, significant at the 5% level, 1 degree of freedom).

Table 2. Significant association between the number of hours worked and the annual
prevalence of back/neck pain (chi-squared test, significant at the 5% level, 1 degree of freedom)
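For readers wishing to reproduce this kind of analysis, the following sketch (in Python) shows how a 2x2 chi-squared test of association could be run. The cell counts and the grouping of hours worked are hypothetical, since the underlying contingency table is not reproduced here.

```python
# A minimal sketch of the kind of 2x2 chi-squared test reported above.
# The cell counts below are hypothetical placeholders, not the study's data.
from scipy.stats import chi2_contingency

# Rows: hours worked (e.g. part-time vs full-time) -- assumed grouping
# Columns: back/neck pain in the last 12 months (yes, no)
table = [[20, 15],   # hypothetical counts
         [45, 11]]

chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi-squared = {chi2:.2f}, df = {dof}, p = {p:.3f}")
# With 1 degree of freedom, p < 0.05 would correspond to the paper's
# "significant at the 5% level" criterion.
```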

There was also a significant relationship (chi-squared test, significant at the 5% level,
1 degree of freedom) between having taken sick leave for back/neck pain in the last 12 months
and working where the recommendations made by the nurse's manager had not been implemented.

Table 3. Significant relationship between non-implementation of managers' recommendations
and sick leave due to back/neck pain in the last 12 months
(chi-squared test, significant at the 5% level, 1 degree of freedom)

4 Discussion
The nurses in this study, contrary to many of those in other studies (Buckle, 1987), showed no
significant relationship between lifetime, annual and point prevalence of back/neck pain and
age or length of time as a nurse. However, this group was small compared to other studies.
Many of those studies used different time bands for prevalence and a multiplicity of
methodologies, and are therefore not easily comparable. Nor is back pain itself an easy
problem to define, being a symptom rather than a disease. Recent research (Knibbe and
Friele, 1996) shows similar findings of lifetime prevalence of back/neck pain amongst Dutch
community nurses. Their findings of 87% compare with the 83% revealed in this study.
One of the most powerful results shown is that a nurse obliged to work in a situation
where her manager has not been able to implement her own recommendations for safe
practice is three times more likely to have taken sick leave for back/neck pain in the last 12
months. This finding alone shows the importance of a thorough assessment of the risks of
manual handling and emphasises the necessity of implementing the recommendations, or
altering the package of care delivered until such time as a safe system of work is in place.
Research shows that nursing carries an increased risk of work-related back/neck pain
(HSE, 1992). Entering the nursing profession itself could thus be seen as an exercise in risk-
taking behaviour, but many people believe in their own personal immunity (Adler et al,
1992). The zero-risk theory postulates that, although people aim to eliminate risky behaviour,
they often fail to do so because of faulty perception of risk (Pitz, 1992). Many nurses have been
misled into believing that correct use of technique will protect them from injury (Harber et al,
1988), whereas it is an ergonomic approach, not the provision of training alone, that will
improve the situation. Thus their apparent risk-taking behaviour is non-deliberative (Yates,
1992) and linked to a lack of up-to-date knowledge of research-based practice. Due to the
apparent increase in the number of very dependent patients in the community, including the
terminally ill, it appears that, despite the increased input from Social Services revealed in this
research, those patients requiring nursing need more complex intervention. It is known that
these types of patients are physically and mentally more demanding to nurse (Hignett and
Richardson, 1995), and nurses may well be prepared to take risks for those patients who have
a limited life expectancy.
Hands-on community nurses believe that stooping and space constraints are their main
problems, but their managers believe that non-availability of hoists and space constraints are
the main problems. Perhaps this is because relatively few of the managers are now
performing much of the practical work of dressings etc, so are not personally reminded of the
risks of holding a static posture. As Adams (1995) states, the decision makers are often
removed from the consequences of their actions.

5 Conclusion
Differing perceptions of risk between managers and staff were revealed in this study. The
MHOR 1992 emphasise that the workforce must be consulted whilst making the handling
assessment; perhaps this could be a joint training issue. Community Nurse Managers must
understand the importance of implementing their prescriptions for safe practice in manual
handling, including the legal and financial implications. Procedural solutions alone, such as
training, have a limited effect (Buckle et al, 1992). The findings of this research confirm those
of many others, in that a multi-faceted approach including design and engineering solutions
is essential to reduce the high incidence of work-related back/neck pain in community
nursing.

References
Adams, J. 1995, Risk. London: UCL Press
Adler, N.E., Kegeles, S.M. and Genevro, J.L. 1992, Risk taking and health. In Yates, J.F. (ed)
Risk-taking Behaviour. Chichester: John Wiley and Sons Ltd.
Buckle, P.W. 1987, Epidemiological aspects of back pain within the nursing profession.
International Journal of Nursing Studies 24 (4) 319–324
Buckle, P.W. Stubbs, D.A., Randle, P. and Nicholson, A. 1992, Limitations in the
application of materials handling guidelines. Ergonomics 35(9) 955–964
Harvey, J. 1985, The Witney Healthy Back Survey: a survey of staff attitudes towards lifting
patients and objects, and the causes of back pain at Witney Community Hospital.
Oxfordshire Health Unit: Centre for Health Promotion.
Health and Safety Executive. 1992, Manual handling—Guidance on Regulations London:
HMSO
Hignett, S. and Richardson, B. 1995, Manual handling human loads in a hospital: an
exploratory study to identify nurse’s perceptions. Applied Ergonomics 26(3) 221–226
National Audit Office. 1996, Health and Safety in NHS Acute Hospital Trusts in England.
London: The Stationery Office.
Pitz, G.F. 1992, Risk Taking, design and training. In: Yates, J.F. (ed) Risk-taking Behaviour.
Chichester: John Wiley and Sons Ltd.
Yates, J.F. 1992, Epilogue. In Yates, J.F. (ed) Risk-taking Behaviour. Chichester: John Wiley
and Sons Ltd.
CHILDREN'S NATURAL LIFTING PATTERNS:
AN OBSERVATIONAL STUDY

Fiona Cowieson

Department of Health Studies


Brunel University
Borough Road, Twickenham
London TW7 5DU

This preliminary observational study explored children's natural approaches
to a lifting task. A trained observer viewed sagittal plane videorecordings of
eighteen children lifting a box. Lifting performance was assessed using an
observational checklist and posture was categorised according to criteria for
stoop, squat or semisquat. The videorecordings were digitised to provide
angular (trunk, knee) and distance (between subject and load) data. Fourteen
children stooped and none adopted a squatting posture. It is suggested that
schoolchildren do not naturally lift 'correctly' and that there may be a basis
for targeting training at children aged 7 or younger.

Introduction
Since 1992 the provision of training in manual handling for adult workers has been required by
law (HSE, 1992). Mostly, such training encourages people to lift ‘correctly’ by keeping the
back straight and holding the load close to the body. The biomechanical argument for this is
clear. However, there is little and conflicting evidence to suggest that training programmes are
effective in either the reduction of low back pain or in changing working postures (Troup and
Edwards, 1985; Pheasant, 1991). Indeed, everyday observation suggests that adults do not
automatically lift in a ‘correct’ manner. Conversely, anecdotal evidence suggests that children
do in fact adopt a squatting posture when lifting. If children do lift ‘correctly’ there may be a
basis for introducing lifting training at an earlier age before this advantage is lost. Equally, if
children do not lift in a ‘correct’ way it may suggest that training programmes for adults are, by
attempting to replace habits which have developed over a long period of time, doomed to failure.
No studies have investigated children's natural preferences when lifting, yet it seems reasonable
to suggest that a greater understanding of children's lifting preferences may provide information
useful to the future development of training programmes. Hence, this exploratory
study aimed to examine the naturalistic lifting behaviour of prepubertal children.

Method
Subjects
Eighteen children participated in the study (mean age 7.5 ± 0.4 yrs; mean Body Mass Index
(W/H²) 22.16 ± 6.1). All subjects were right-handed. Using personal or parental
Children’s natural lifting patterns 93

questionnaires, subjects were screened for any major visual or musculoskeletal disorders
which might affect their lifting behaviour. None of the subjects had received any lifting
instruction in school.

Procedure
This is an observational study of prepubertal children performing a natural lift. All subjects
wore shorts and T shirts. Four markers were attached to the skin overlying: 1) the spinous
process of the 7th cervical vertebra, 2) the lateral malleolus, 3) the lateral knee joint line and
4) the greater trochanter. Each subject stood 2 metres away from a cardboard box weighing
682 g with dimensions 375 mm × 440 mm × 125 mm (H × W × D). Using standardised
instructions, each subject was asked to lift and place the box on a table. Each child performed
three lifts with a 30 second rest between each lift. Continuous videorecordings were made
with a camera placed in a standardised position 4 metres lateral to the load in order to obtain a
sagittal view of the lifting posture. No child was permitted to view other children lifting.

Data acquisition
Manual digitisation of the position of each skin marker was carried out by a trained observer
using a real time video frame grabber and a standard computer. The reliability and accuracy
of the digitising system were tested against a series of known angles and found to be high
(mean angular error 0.86° ± 0.57°, SE 0.13°).

Angle and distance data


Angle and distance data were obtained from the videos by digitising the position of the four
skin markers together with a fifth point corresponding to the estimated centre of gravity of the
box. The four markers enabled definition of two angles, one representing forward inclination
of the trunk (torso angle) and the other knee posture (knee angle). The horizontal distance
between the subject and the box was determined from the marker on the lateral malleolus and
the digitised fifth point on the box corresponding to the estimated centre of gravity of the box.
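As an illustration of this derivation, the sketch below (Python) computes a trunk inclination, knee angle and subject-to-load horizontal distance from digitised 2-D marker coordinates. The particular angle constructions and the sample coordinates are assumptions for illustration only; the study's own digitising software is not described in that detail.

```python
# A sketch of deriving the two angles and the horizontal distance from
# digitised 2-D (sagittal plane) marker coordinates. The constructions used
# here (trunk inclination from vertical via C7 and greater trochanter; knee
# angle from trochanter-knee-malleolus) are assumptions for illustration.
import math

def angle_from_vertical(upper, lower):
    """Inclination (degrees) of the segment lower->upper relative to vertical."""
    dx, dy = upper[0] - lower[0], upper[1] - lower[1]
    return math.degrees(math.atan2(abs(dx), dy))

def joint_angle(a, b, c):
    """Included angle (degrees) at point b formed by segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return math.degrees(math.acos(dot / (math.hypot(*v1) * math.hypot(*v2))))

# Hypothetical digitised coordinates (metres) for one video frame.
c7, trochanter, knee, malleolus = (0.35, 1.05), (0.20, 0.55), (0.30, 0.30), (0.25, 0.05)
box_cg = (0.55, 0.10)

torso_angle = angle_from_vertical(c7, trochanter)      # trunk flexion from vertical
knee_angle = joint_angle(trochanter, knee, malleolus)  # knee posture
horizontal_distance = abs(box_cg[0] - malleolus[0])    # subject-to-load distance

print(f"torso {torso_angle:.1f} deg, knee {knee_angle:.1f} deg, "
      f"distance {horizontal_distance:.2f} m")
```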

Performance data
The Chaffin Lift Evaluation Record (Chaffin et al, 1986) was used to assess subjects’
performance. This provides binary (Y/N) data based on 6 criteria: one vs two-handed lifting,
smoothness of lifting, avoidance of twisting, proximity of load, erectness of trunk, and grip.
Subjects scored 1 point for each criterion they complied with, hence the maximum possible
performance score was 6.
Each lift was categorised as squat, semisquat or stoop. The main criterion used for the
categorisation was the flexion of the trunk relative to the vertical: squat, trunk flexed up to 30°
from the vertical; semi-squat, trunk flexed between 30° and 60°; stoop, trunk flexed beyond 60°
from the vertical, towards the horizontal. Posture was assessed from the first identified frame
in which the box began to be lifted from the floor. A trained observer familiar with the
criteria for rating the videos viewed
each video as often as she wished until satisfied that the lifts had been correctly classified.
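The categorisation and scoring described above can be summarised as in the following sketch (Python). The trunk-flexion thresholds follow the criteria stated in the text; the six binary criteria are represented only as illustrative field names, not the original record form.

```python
# A sketch of the posture categorisation and performance scoring described
# above. Thresholds follow the stated criteria; the Chaffin Lift Evaluation
# Record is represented simply as the sum of six binary (yes/no) criteria,
# with illustrative (hypothetical) field names.
def categorise_posture(trunk_flexion_deg):
    """Classify a lift by trunk flexion from the vertical at box lift-off."""
    if trunk_flexion_deg <= 30:
        return "squat"
    elif trunk_flexion_deg <= 60:
        return "semisquat"
    return "stoop"

def chaffin_score(criteria):
    """Sum of binary criteria; maximum possible score is 6."""
    return sum(1 for met in criteria.values() if met)

lift_criteria = {            # hypothetical observations for one child
    "two_handed": True,
    "smooth": True,
    "no_twisting": True,
    "load_close": True,
    "trunk_erect": False,
    "good_grip": True,
}

print(categorise_posture(72.0))      # -> 'stoop'
print(chaffin_score(lift_criteria))  # -> 5 (out of 6)
```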

Results
Descriptive statistics were used for angular and distance data and performance scores. The
lifting behaviour of each of the eighteen children was consistent between successive lifts.

Squatting was not observed in any of the children so that two categories emerged, semisquat
and stoop. Four children (22%) were classified into semisquat posture and 14 (78%) into the
stoop category. The results presented in Table 1 summarise the principal data for each group.
As can be seen, there was little difference between subjects who semisquatted and those who
stooped in terms of their proximity to the load and their performance scores.

Table 1. Torso and knee angles, horizontal distance and performance scores

Discussion
Squatting, with an almost vertical trunk posture, was not observed in any of the children. Most
children (14) adopted a stoop posture, with the trunk flexed toward the horizontal. A minority
(4) adopted a semisquat, thereby conforming to a safer lifting posture. Clearly, the present study
refutes anecdotal reports of children adopting a squatting posture when lifting. For all of the
children, lifting behaviour was consistent between all three lifts making it seem reasonable to
presume that approaches to lifting are well established by 7 years of age. By giving children
specific instructions to lift a standard box, this study aimed to balance a requirement to
observe children's natural behaviours with some degree of control over the task requirements.
It could be argued that a truly unobtrusive study method would have been preferable, but given
that all of the children lifted consistently in the same manner, and can be presumed to be
ignorant of 'correct' lifting, it seems reasonable to assume that the observed behaviours
were representative of the children's 'usual' approaches to this type of lifting task.
As none of the children adopted a squatting posture, it seems clear that children as young
as 7 may, as a consequence of their lifting behaviours, be at risk of injury. Sheldon
(1994) has previously suggested that children should be targeted with training related to
manual handling and this study could be considered to provide some support for that line of
Children’s natural lifting patterns 95

thinking. However, the age group which should be targeted remains open to question. In the
present study, the decision to investigate 7-year-olds was an arbitrary one based on the
requirement to investigate children who were old enough to respond reliably to instructions.
Yet, most of these children adopted the less safe stooping posture when lifting. At face value
this might suggest that training should be directed at younger children. A prerequisite to the
development of such training is future work to investigate not only the lifting preferences of
younger children but also the influence of task variables on lifting posture.
Differences in the functional posture adopted by individuals performing the same task have
previously been recognised (Ikeda et al, 1991), so it is perhaps unsurprising that some
children elected to use a semisquat posture. Clearly, the different postures observed may be related
to individual differences such as anthropometry or strength. Although further work is
required to determine if there is a relationship between anthropometry and lifting posture,
these children may lift in a way which optimises their particular individual differences. If this
is the case one implication might be that training in manual handling for either adults or
children should focus on preserving individual differences in lifting style rather than on the
imposition of a ‘correct’ lifting behaviour.
Finally, horizontal distance was remarkably consistent for subjects in each of the two
categories, with most children naturally electing to stand close to the load. If this is confirmed
in larger-scale studies, it would suggest that the instruction to 'stand close to the load'
may in fact be redundant.

References
Chaffin, D.B., Gallay, L.S., Woolley, C.B. and Kuciemba, S.R. 1986, An evaluation of the
effect of a training program on worker lifting postures, International Journal of
Industrial Ergonomics, 1, 127–136
Health and Safety Executive. 1992, Guidance on Manual Handling Regulations. L23.
Health and Safety Executive. HMSO
Ikeda, R.E., Schenkman, M.L., Riley, P.O., Hodge, W.A. 1991, Influence of age on
dynamics of rising from a chair. Physical Therapy 71 61–69
Pheasant, S.T. 1991, Ergonomics, Work and Health. (Macmillan, Edinburgh)
Sheldon, M.R. 1994, Lifting instruction in children in an elementary school, Journal Orthop
Sports Phys Ther. 19 105–108
Troup, J.D.G. and Edwards, F.C. 1985, Manual handling and lifting: an information and
literature review with special reference to the back. Health and Safety Executive.
HMSO
MANUAL HANDLING AND LIFTING DURING THE
LATER STAGES OF PREGNANCY

T.Reilly and S.A.Cartwright

Research Institute for Sport and Exercise Sciences


Liverpool John Moores University
Mountford Building, Byrom Street
Liverpool, L3 3AF

Women may have to maintain manual handling/lifting activities, in domestic
and/or occupational roles, during pregnancy. In the present study,
observations were made at weeks 24–26 and 36–38 pre-term, and 12–16
weeks after birth (n=7). On 3 successive days, anthropometry, isometric lift
performance and self-selected dynamic lifts (6 lifts/min for 10 min) were
assessed, respectively. Ten controls, matched for age and body size, were
measured over the same time frame. Performances in the isometric lifts, at
knee and at waist height, did not change during later pregnancy. Dynamic
lifting performance was unchanged over this same period, although
perceived exertion increased. Loads handled by the pregnant subjects were
lower than for the control subjects. Performance post-partum improved
compared to measurements during the final trimester of pregnancy.
Perceived exertion post-partum varied for individual body locations more
than for whole-body exertion. It seems that lifting performance is not seriously
compromised during pregnancy when the load is self-selected, and that
isometric endurance in particular is improved post-partum.

Introduction
Women may be obliged during pregnancy to maintain handling/lifting activities, whether in
domestic or occupational roles. There is also an increasing commitment among pregnant
women to stay at work as long as possible prior to giving birth and to resume physical work
soon afterwards. Changes occurring during pregnancy include weight gain, alterations in the
centre of mass and body shape, and adaptations in gait. There are also alterations in
the ventilatory, cardiovascular, oxygen transport and endocrine systems that contribute to the
healthy development of the foetus without compromising maternal requirements.

Physiological and anatomical changes during pregnancy are not necessarily detrimental to
physical performance. Previously we have reported that lifting performance (isometric
endurance lift; vertical and asymmetric lifts) was not impaired during pregnancy (Sinnerton
et al., 1993, 1994) up to near term. Here we report observations made repeatedly pre-
term over 3 days and follow-up post-partum. For dynamic lifting tasks it was important that
the methods involved self-selection of load, so that the perceived capabilities of the pregnant
women were monitored. Consequently, for the pregnant women all of the tests were at
submaximal intensity. For both the pregnant and control subjects, the data obtained at the end
of the second trimester (weeks 24–26) are used for reference purposes.

Methods
Seven women agreed to participate in this part of a larger study concerned with lifting
performance of pregnant women. The women were aged 32 (±2) years and on their first visit
to the laboratory weighed 64 (±5) kg at week 13. For this study they were measured pre-term
at weeks 24–26 (Stage 4), weeks 36–38 (Stage 5) and again 12–16 weeks after the birth. Ten
women who were matched for age and body size and were not pregnant were measured at the
same time to form a control group.
The measurements were made on three consecutive days on each of the test occasions. All
participants had been familiarised with the procedures. On the first day body mass was
recorded and skinfold thicknesses measured using a skinfold caliper (Harpenden) over
biceps, triceps, subscapular and suprailiac sites. An isometric lift (at both knee and waist
height) was performed on day 2 and vertical and asymmetric dynamic lifts were performed
on day 3. All lifts were performed in the afternoon, at approximately the same time on each
occasion. Subjects wore appropriate clothing and were reminded not to eat, drink coffee or
smoke prior to the performance of the tests. The procedures were approved by both the
University’s and the Liverpool Maternity Hospital’s Ethics Committees.
Performance of the lifting tasks was as previously described (Sinnerton et al., 1993;
1994). The isometric lifts were measured using the dynamometer validated by Birch et al.
(1994). The lifts were performed at both knee and waist height and were adjusted to the
anthropometric requirements of each participant. The criterion was the length of time the
applied force could be maintained within a strictly defined range. This was determined to be
35–45% of a maximum, previously established using the control group.
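A minimal sketch of this endurance criterion, assuming a sampled force trace from the dynamometer (the sampling rate and trace below are hypothetical, as the original output format is not described), is given here for illustration.

```python
# A sketch of the isometric endurance criterion: the score is the length of
# time the applied force stays within a defined band (35-45% of a previously
# established maximum). The force trace and sampling rate are hypothetical.
def endurance_time(force_trace, sample_rate_hz, max_force, lo=0.35, hi=0.45):
    """Seconds for which force remains continuously within the target band."""
    samples_in_band = 0
    for f in force_trace:
        if lo * max_force <= f <= hi * max_force:
            samples_in_band += 1
        else:
            break  # endurance ends once the force leaves the band
    return samples_in_band / sample_rate_hz

# Example: a made-up 1 Hz force trace (newtons) against a 500 N maximum.
trace = [190, 200, 210, 205, 198, 220, 230, 160]
print(endurance_time(trace, sample_rate_hz=1, max_force=500))  # -> 6.0 s
```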
The dynamic lifting tasks incorporated adaptations to the psychophysical method of
Snook (1978). Participants selected the maximum load (MAL) they believed themselves to be
able to lift repeatedly for 10 min at a frequency of 6 lifts per minute (down and up). The bags,
from which the load was chosen, gave no visual clues as to the amount they weighed. A
standard set of instructions was given to each participant. The weight in the tote box could be
adjusted at any point throughout the lift. The dynamic lift was performed vertically and the
asymmetric lift was through an angle of 90°. The height of the lift was over a fixed distance in
both cases, approximately from waist to knee height.
Immediately following completion of each task, perceived exertion was rated using
Borg’s (1982) scale. The participants were asked to rate the task for general (whole-body)
and localised (muscular, breathing) effects.

Results
Body mass measured during the post-partum stage (67.0±8.1 kg) was significantly less than
that measured in the later stages of pregnancy, namely at Stage 4 (74.7±6.7 kg; P<0.05) and
Stage 5 (77.0±6.5 kg; P<0.01). When the post-partum body mass was compared with the body mass
of the control group at all stages, the results were similar. Total skinfold thickness measured
in the post-partum stage did not differ significantly from that measured during pregnancy
(P>0.05). When the total skinfolds of the post-partum group were
compared with those of the control group, no notable differences were found.
Performances in the isometric lifts, at knee and at waist height, did not change during the
later stages of pregnancy. Generally, the performances were more variable in the pregnant
subjects compared to the reference group of non-pregnant women. The endurance times at
knee height were marginally greater in the control subjects whereas the pregnant subjects
were slightly better in endurance performance at waist height. None of these differences
reached statistical significance (P>0.05). For the pregnant group, the duration at waist height
was greater than at knee height (P<0.01), but this difference was also observed in the control
subjects.
The performance on the dynamic lifting tasks was unchanged over this same period,
although there was an increase in the rating of perceived exertion. This applied to both the
vertical and the asymmetric lifts. Loads handled by the pregnant subjects were lower
(P<0.01) than those employed by the control subjects.
Performance post-partum improved in the isometric lift at waist height and in the dynamic
lifts compared to the measurements obtained during the later stages (final trimester) of
pregnancy. Perceived exertion post-partum varied in all tests for individual body locations
more than for whole-body exertion.

Table 1. Endurance time for isometric lifting at knee height and waist height, and
maximal acceptable lift (MAL) for pregnant and control groups

The post-partum results were similar for the two dynamic lifts (P>0.05). Nevertheless, the
performances by the mothers did not reach the values attained by the non-pregnant control
participants (P<0.05).

Perception of whole-body exertion (RPE) during the isometric lift at waist height post-
partum did not differ from that during pregnancy (P>0.05). There was a decrease in the RPE
values for the back from 10±3 at Stage 4 to 8±2 post-partum. The decrease for the lower legs
from Stage 5 (11±3) to post-partum (8±2) was also significant (P<0.05). There were no
significant differences between whole-body RPE post-partum and the values reported by the
control group. Similarly for the dynamic lifts, there were no changes between the later stages
of pregnancy and post-partum for whole-body ratings of exertion. Ratings for the back were
higher at Stage 4 (12±3) and for the abdominals were higher (P<0.05) at Stage 5 (12±3) than
post-partum (9±3). For the asymmetric dynamic lift, the main difference was a decrease in
ratings for the abdominals from 11±3 at Stage 5 to 9±3 (P<0.05) post-partum.

Discussion
The results of Maximum Acceptable Load (MAL) during pregnancy, for both the vertical and
asymmetric dynamic lifts, did not change as pregnancy progressed. This implied that
dynamic lifting performance could be maintained throughout the course of pregnancy.
Although there were no visual clues as to the amount the women were lifting, each subject
was obviously able to select a suitable amount at each stage. However, in contrast to the
findings for the isometric endurance lifts, the results showed that the pregnant group, at all
stages, lifted substantially less than the control group. This is in agreement with the findings
of Masten and Smith (1988), who suggested that non-pregnant women are significantly stronger than
pregnant women. One might at first try and explain this difference by the increase in body
mass and the change in shape which occurs as pregnancy progresses, but this explanation
would not be as plausible when applied to the earlier stages of pregnancy. The
psychophysical methodology chosen means it is difficult to state that the pregnant women
were not as ‘capable’; they may well have been capable, but chose not to lift as much. The
purpose of using the psychophysical methodology was to enable the women to exercise an
element of subjective judgement. This is crucial throughout pregnancy when so many
changes are occurring to the woman’s body. Therefore, when the pregnant women were asked
to select a load for the frequency and time period specified, they chose the ‘maximum’
amount with which they felt comfortable. Nicholson and Legg (1986) noted that when
subjects were asked to produce the maximum weight acceptable for a given time the load
subjectively corresponded to “Fairly light” on the Borg scale, highlighting the “acceptable”
notion of the methodology. It may have been the case that the women underestimated this
amount in order to 'be on the safe side'. Also, precisely because MAL was used to incorporate
subjective judgement, it follows that the selected load tends to be influenced and limited by
psychological factors. Ayoub et al. (1980) considered that the psychophysical methodology
was concerned with the relationship between sensations and their physical stimuli. Previous
research has also found individual interpretation to be a disadvantage (e.g. Snook, 1985;
Mital, 1983). Other factors may influence the amount lifted, such as motivation and previous
experience of lifting tasks. The fact remains that the pregnant women were consistent in their
estimations and maintained their dynamic lifting performance throughout the course of
pregnancy.
The results in both groups of women suggest that there is no difference between the
performance of the vertical and asymmetric lifts. This is in contrast to previous research
which has shown that the amount lifted vertically tends to be greater than that lifted
asymmetrically (Garg and Badger, 1986). Lack of differences between the dynamic lifts in
this report may again be, in part, inherent in the psychophysical methodology (however, it
must be noted that the 1 RM results did not show variations between vertical and asymmetric
lifts). With the control group the amount lifted as a percentage of the 1 RM was at least 50%.
This figure varied, often influenced by the amount the subjects felt they could lift as a
maximum. These figures did not change significantly over the entire series of tests,
suggesting that if subjects did tend to under- or over-estimate the amount they were to lift,
they did so consistently over a period of nearly 12 months.
It seems that lifting performance is not compromised appreciably throughout pregnancy
when the load is self-selected. In this study subjective exertion increased whilst lifting
performance was maintained. Self-chosen work-load was unaffected by physiological
changes as the pregnant women neared term and performance improved following the birth.
Further research is envisaged to establish the behavioural adaptations that occur during
pregnancy which permit the maintenance of manual handling performance and the factors
underlying the improvement post-partum.

Acknowledgements
This work was supported by a grant from the Health and Safety Executive.

References
Ayoub, M.M., Mital, A., Bakken, G.M., Asfour, S.S. and Bothea, N.J. 1980, Development of
strength and capacity norms for manual handling activities: the state of the art. Human
Factors, 22, 271–283
Birch, K., Sinnerton, S., Reilly, T. and Lees, A. 1994, The relation between isometric lifting
strength and muscular fitness measures. Ergonomics, 37, 87–93
Borg, G., 1982, Psychophysical bases of perceived exertion. Medicine and Science in Sports
and Exercise, 14, 377–381
Garg, A. and Badger, D. 1986, Maximum acceptable weights and maximum voluntary
isometric strength for asymmetric lifting. Ergonomics, 29, 879–892
Masten, M.Y. and Smith, J.L. 1988, Reaction time and strength in pregnant and non-pregnant
employed women. J. Occ. Med., 30, 451–456
Mital, A. 1983, The psychophysical approach to manual lifting—A verification study. Human
Factors, 25, 485–491
Nicholson, L.M. and Legg, S.J. 1986, A psychophysical study of the effects of load and
frequency upon selection of workload in repetitive lifting. Ergonomics, 29, 903–911
Sinnerton, S., Birch, K., Reilly, T. and McFadyen, I.M. 1993, Weight gain and lifting during
pregnancy. In E.J.Lovesey (ed.) Contemporary Ergonomics (Taylor and Francis,
London), 305–307
Sinnerton, S., Birch, K., Reilly, T. and McFadyen, I.M. 1994, Lifting tasks, perceived
exertion and physical activity levels: their relationship during pregnancy. In
S.A.Robertson (ed.) Contemporary Ergonomics (Taylor and Francis, London),
101– 105
Snook, S.H. 1978, The design of manual handling tasks. Ergonomics, 21, 963–985
Snook, S.H. 1985, Psychophysical considerations in permissible loads. Ergonomics, 28,
327–330
POSTURE ANALYSIS AND MANUAL HANDLING IN
NURSERY PROFESSIONALS

Joanne O.Crawford and Rhonda M.Lane

Industrial Ergonomics Group


School of Manufacturing and Mechanical Engineering
University of Birmingham
Edgbaston
Birmingham B15 2TT

The incidence of back pain in nurses and others involved in handling patients
is well documented. However, this is not the case for nursery professionals.
This study aimed to investigate the prevalence of back pain and poor posture
A modified version of the Nordic musculoskeletal questionnaire was
administered, working postures were observed using OWAS and participants
rated selected work tasks using the Borg RPE-scale. The results indicated
that the prevalence of low back pain is similar to that found in nurses.
Participants also experienced pain in other body sites. The main activities
contributing to poor postures were play activities and meal supervision.
Recommendations include a formal risk assessment of this environment and
education for staff to increase awareness when handling children.

Introduction
This study examines the incidence of back pain, body discomfort and working postures in
nursery carers. There is an absence of direct research material in this area; however, much of
the research relating to nursing professionals and manual handling is relevant (Stubbs et al,
1983; Baty and Stubbs 1987; Pheasant and Stubbs 1992). Corlett et al (1993) found that
caring for children can cause a high level of postural stress and that lifting children from the
floor is likely to cause or aggravate low back pain. Recommendations were made for those
working with children, including checking the height from which the child is lifted, ensuring
the side of the cot is down, kneeling or squatting to the child's level and carrying the child
close to the trunk (Corlett et al, 1993).
The impetus for this study was the perception that nursery carers and carers of older
children with disabilities were at increased risk of injuries and postural problems. Babies
and young children tend to occupy low positions on the floor; thus, in order to lift them,
nursery carers have to lift from below knee level. Although babies and young children
are perceived as small and lightweight, the frequency with which they are lifted is a further
risk factor. The aim of the study was to examine the prevalence of back pain, other areas of
body discomfort and the postures adopted by nursery carers at work.

Method

Participants
Twelve female participants working in two nurseries took part in the study. Eight were
observed in nursery one, five in nursery two… The age of the staff ranged from 18 to 47 years
and length of time working in the nursery environment ranged from 6 months to 21 years. Six
of the participants worked with children aged 6 weeks to 18 months, the remainder worked
with children from 18 months to 5 years. In nursery one staff worked over a 10 hour period
with a one hour break during the day. In nursery two staff worked for an 8 hour period with a
one hour break. The ratio of staff to children was one staff member to three children under 18
months and one staff member to 5 children over 18 months.

Back Pain and Postural Analysis


The questionnaire used to assess discomfort was a modified version of the Nordic
Musculoskeletal Questionnaire (Kuorinka et al, 1987). The modifications to the
questionnaire included an additional section on perceived fitness levels and further
biographical details. The questionnaire was administered to participants in a structured
interview format.
The Ovako Working Posture Analysis System (OWAS) was used to identify and classify
working postures (Karhu et al, 1977). Participants were observed for 45 minutes each at
intervals of 30 seconds, giving 90 observations per participant. Although it is
recommended that data collection be carried out using a video camera, this was not permitted
in the nursery environment. The weights of the children were estimated using growth tables
provided by Samtrock (1994).
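To illustrate how such interval-sampled observations can be summarised, the sketch below (Python) tallies hypothetical OWAS action category codes into percentages of observation time; the coding of individual back, arm and leg postures into action categories is not reproduced here.

```python
# A sketch of how interval-sampled OWAS observations might be tallied.
# Each observation is assumed to be recorded simply as an action category
# (AC 1-4); the sample data are hypothetical.
from collections import Counter

def summarise_owas(action_categories):
    """Return percentage of observation time in each OWAS action category."""
    counts = Counter(action_categories)
    total = len(action_categories)
    return {ac: 100.0 * counts.get(ac, 0) / total for ac in (1, 2, 3, 4)}

# 90 observations per participant (45 min sampled every 30 s); made-up codes.
observations = [1] * 52 + [2] * 25 + [3] * 10 + [4] * 3
summary = summarise_owas(observations)
print(summary)
print(f"time requiring action (AC 2-4): {sum(summary[ac] for ac in (2, 3, 4)):.1f}%")
```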

Task Evaluation
Nursery staff were asked to rate the main tasks they performed using the Borg Scale
(Borg 1985).

Results

Back Pain
The point prevalence of low back pain was reported by 4 of the sample (33.3%) and the 12-month
period prevalence by 6 of the sample (50%). During the previous 12 months, 16.7% of the sample
reported low back pain as lasting between one and seven days and 8.33% experienced pain on
a daily basis. Table 1 shows the point and 12 month prevalence of pain in various body
areas.

Posture Analysis
The OWAS observations resulted in 1170 observations being collected, which were analysed using
MINITAB. The data were divided into two groups to represent carers of babies (under 18
months) and toddlers (over 18 months). The percentage of time spent on each particular work
activity is shown in Figure 1.

Table 1. Point and 12 month prevalence of pain

Figure 1. Percentage of time spent on work activities

There were significant differences between the two rooms (p<0.001). In the baby room
38.9% of the time was spent carrying out activities with the children. In the toddler room
more time was spent supervising meals (27.6% compared with 19.8%).
The distribution of OWAS action levels is shown in Table 2. As can be seen from the table,
the carers observed spent at least 40% of the observation time in postures which require
action to be taken. The main contributing activities to the poor back and neck postures were
play activities in the baby room and meal supervision in the toddler room.

Task Evaluation
Using the Borg Scale, participants rated various work tasks. Nappy changing was rated as 15
(hard, heavy), lifting babies and children, getting toy boxes out and getting nappy baskets out
were rated as 13 (somewhat hard) and play activities were rated as 9 (very light).

Other Observations
During the observation periods at the nursery a number of work practices were noticed. These
included the use of children's chairs by staff, the use of travel cots (at floor height, with
sides that do not drop down), the moving of heavy equipment, e.g. sandboxes (25 kg), without
knowledge of the weight, and the high storage of clothing and nappy baskets (176 cm from the
floor).

Table 2. Distribution of OWAS action levels during the observation time

Discussion
The prevalence of back pain was similar to that found by Stubbs et al (1983). Fifty percent of
the sample experienced pain in multiple sites, which implies that although the load on the back
can be reduced, the pain may simply be shifted to other body sites. As found from the OWAS
analysis, participants were spending approximately 40% of their time in AC categories 2 to 4,
indicating that action should be taken in the near future to improve posture, and for some tasks
immediately. However, one of the difficulties of working in a childcare
environment is that infants who are not capable of standing must be lifted from floor or cot
height. In a childcare environment there is also the difficulty of maintaining an upright
posture, as there is a need to get down to the child’s level to interact. Lifting
recommendations for dealing with children were suggested by Corlett et al (1993); however,
it must be questioned whether they take account of the repetitive nature of the nursery
carer's job.
Most of the equipment in the nurseries studied is designed to fit children under 5 years
old. Nursery carers were using children's chairs, and this practice should be avoided, as
prolonged sitting in a poorly fitting chair without back support can increase the risk of back
pain (Luopajarvi 1990).
Lifting babies onto changing areas and into high chairs was rated as somewhat hard using
the Borg RPE-scale. Although the use of raised work areas and high chairs does reduce the
need to stoop, the actual lifting of the children to those areas does represent a high risk. The
method used to lift children involves holding the child away from the body to place them in
high chairs. This perhaps reflects a need to examine both the design of high chairs and how
children are placed in them. The use of removable trays on the high chair is also
recommended.
The use of travel cots in the nursery should be avoided. Travel cots are designed for portable,
short-term use. Their use as a permanent tool in the nursery means that carers have to
place children into them, reaching over the side of the cot, down to 10cm above the floor. This
risk factor can be avoided by using cots with drop sides that are higher from the floor.
The storage of children's belongings at a higher level is necessary in the nursery
environment. However, the present height of 176 cm from the floor is considered
unnecessarily high and involves carers reaching above shoulder height. It would be
recommended that this height be reduced to allow the carers to reach but prevent the children
from doing so.

Recommendations
The nature of the nursery carer's job is such that there will always be handling of infants and
bending to the child's level to interact with them. There are a number of recommendations
that can be made as a result of this study, although the sample was small.
A full risk assessment should be made of the tasks that nursery professionals carry out.
This information could then be fed back to the carers to increase their knowledge regarding the
weights of children and equipment and how they could change the tasks to reduce the
handling risk. There is also a clear need for nursery staff to be educated in the risks involved.
The workplace also needs to be improved by providing adult sized equipment for staff when
carrying out administration work or taking rest breaks.

Further Research
Recommended further research in this area would be to examine the manual handling training
given to nursery carers and whether this is adequate for the work tasks that they do. There is
also a clear need for ergonomics information to be provided to nursery managers regarding
the purchase and use of equipment such as cots and high chairs.

References
Baty, D. and Stubbs, D.A., 1987, Postural stress in geriatric nursing. Journal of Nursing
Studies, 24, 339–344
Borg, G., 1985, An Introduction to Borg‘s RPE-scale. Movement Publications, Ithaca, New
York
Corlett, E.N., Lloyd, P.V., Tarling, C., Troup, J.D.G. and Wright, B., 1993, The Guide to the
Manual Handling of Patients. National Back Pain Association, London
Karhu, O., Harkonen, R., Sorvali, P. and Vepsalainen, P, 1977, Observing work postures in
industry. Examples of OWAS application. Applied Ergonomics, 12, 13–17
Kuorinka, I., Jonnson, B., Kilbom, A., Vinterberg, H., Biering-Sorensen, F., Anderson, G.
and Jorgensen, K., 1987, Standardised Nordic questionnaire for the analysis of
musculoskeletal symptoms. Applied Ergonomics, 18, 233–237
Luopajarvi, T., 1990, Ergonomics analysis of workplace and postural load. In M.I.Bullock
(ed.) Ergonomics: the physiotherapist in the workplace. (Churchill Livingstone,
Edinburgh)
Pheasant, S. and Stubbs, D.A., 1992, Back pain in nurses. Applied Ergonomics, 23, 226–233
Samtrock, J.W., 1994, Child Development. (Brown and Benchmark, Madison USA)
Stubbs, D.A., Buckle, P.W., Hudson, M.P. and Rivers, P.M., 1983, Back pain in the nursing
profession II. Ergonomics, 26, 767–779
POSTURE
CAN ORTHOTICS PLAY A BENEFICIAL ROLE DURING
LOADED AND UNLOADED WALKING?

David C.Tilbury-Davis1, Robin H.Hooper1, Mike G.A.Llewellyn2

1
Human Sciences Dept. Loughborough University, Loughborough, LEICS. LE11 3TU
2
Protection & Performance, Centre for Human Sciences,
Defence Evaluation Research Agency, Farnborough, HANTS. GU14 6TD

Increased knee flexion occurs post heel contact whilst carrying a heavy load.
To establish the influence of knee orthotics that inhibit anterior tibial
displacement on changes induced by load carriage, ten military subjects
were assessed under four conditions (Unloaded, 20kg load, 40kg load and
40kg load+orthotics). Ankle and knee flexion/extension angular
displacements and velocities were derived. Ground reaction force data and
peak force time parameters were derived. Force data were expressed as
percentage of body weight. Significant differences were found in propulsive
impulse, work and power; as well as vertical impulse, work and power.
These were increased by knee orthotics. Knee flexion during load carriage
was not reduced by orthotics (p>0.05). The possibility that orthotics assist in
carrying heavier loads was not demonstrated, while there was an increase in
physiological cost.

Introduction
The physiological effects of load carriage have been well documented and reviewed (Knapik
et al, 1996). The electromyographic activity of certain muscles has also been documented
(Bobet et al, 1984; Ghori et al, 1985; Holewijn, 1990; Harman et al, 1992). Fewer studies
have looked at the kinematic and kinetic effects of load carriage. Increased knee flexion
occurs post heel contact during load carriage (Kinoshita, 1985). Stance phase duration is
unchanged by load carriage of up to 50% of body weight, but swing phase duration is decreased
(Ghori et al, 1985; Kinoshita, 1985; Martin et al, 1986). Ground reaction force, braking
force, propulsive force and lateral force are all increased by increasing load (Kinoshita,
1985; Harman et al, 1992), although the change in vertical force is not proportional to the
load carried (Harman et al, 1992). The effects on medial force are inconclusive (Kinoshita,
1985; Harman et al, 1992). Hip flexion and knee flexion post heel contact are increased by
load carriage, along with antero-posterior rotation of the foot about the distal ends of the
metatarsal bones (Kinoshita, 1985; Martin et al, 1986). During fixed speed walking the stride
length may shorten, as load mass is increased (Kinoshita, 1985; Martin et al, 1986; Harman
et al, 1992).

The use of orthotics is widespread in sport and clinical rehabilitation. Increased stability
in lax joints has been shown with the use of knee orthotics and/or prophylactic taping
(Anderson et al, 1992). However, these interventions have been shown to affect lower extremity
motion and vastus lateralis electromyographic activity (Cerny et al, 1990; Osternig et al,
1993) as well as increasing intramuscular pressure (Styf et al, 1992). Knee orthotics may also
cause greater extensor torques at the hip and ankle with more work produced at the hip and
less at the knee (DeVita et al, 1996).

Use of knee orthotics designed for anterior cruciate ligament injuries, which inhibit anterior
tibial displacement or 'giving' of the knee joint, might provide support to offset this flexion,
giving a more upright posture during load carriage. Would such orthotics therefore assist
when carrying loads by attenuating the increase in knee flexion?

Methods

Subjects
Data were collected from seven healthy males for the unloaded assessment of the knee
orthotics (mean age 22.7±3.9 years; mean stature 1.79±0.06 m; mean mass 75.96±7.04 kg).
Data for the loaded conditions were collected from ten further healthy males (mean age
24.5±3.5 yrs; mean stature 1.77±0.07 m; mean mass 78.03±6.66 kg), drawn from serving
military personnel whose jobs involved load carriage, because task experience has been
shown to influence task performance (Littlepage et al, 1997; Vasta et al, 1997).

Apparatus
The knee orthotics (Masterbrace™ 3, Johnson & Johnson Orthopaedics) weighed 3.6–3.9 kg
per pair. A Kistler force plate (model: 9281B) was used to record ground reaction force
parameters. Functional landmarks were identified with 5 markers (5th metatarsal head, lateral
malleolus, 0.15m proximal to the lateral malleolus, 0.15m distal to the glenohumeral condyle
and at the glenohumeral condyle). The movements of the ipsilateral limb were filmed in the
sagittal plane (Panasonic F15 camera with lens WV-LZ14/8AFE). Footage was recorded
(Panasonic AG7330). A Peak 5 (Peak Performance Technologies Inc.) video motion analysis
system was used to analyse marker motion. Force data were synchronised with the kinematic
data using a threshold trigger setting of 0.1904V (for all subjects and weights) and a Peak
Performance Event Synchronisation box.

Procedures
Following ethical clearance (DERA Ethics Committee) and informed consent from subjects,
height, weight and leg length were measured. The subjects were familiarised with procedures
and the orthotics. A start point at least 5 paces from the force plate was found and subjects
walked over the plate at a self-selected velocity and continued for a further 5 paces past the
force plate. When the subjects were ready they were asked to walk towards the plate, whilst
focusing on a point in the distance. This was repeated until 5 clean (foot landing mid-plate)
contacts with the force plate had occurred and the data recorded. For the unloaded study all
subjects completed 10 right foot contacts (5 unbraced, 5 braced) and 10 left foot contacts (5
braced, 5 unbraced). When carrying loads the right limb only was studied. The starting
condition for each subject was randomised, the sequence being unloaded, 20kg, 40kg and
40kg with orthotics.

Data Analysis
After 2-D reconstruction (filtered using a Butterworth optimal filter) of the raw kinematic
data, the ankle and knee flexion/extension angular data were derived along with their
respective angular velocities. Ground reaction force data were derived from the force plate
along with peak force time parameters. Also the braking/propulsion impulse ratio was
calculated to assess constancy of walking velocity (Hamill et al, 1995). All data were
expressed as a percentage of contact time to normalise the individual time phases and force
data were expressed as a percentage of body weight. Peak parameters were statistically
analysed using a t-test for dependent samples in a repeated measures design, corrected for
correlated samples to overcome intersubject leg length variation. The intrasubject coefficients
of variation (CoV) were calculated for the peak force parameters (Winter, 1984). The curves
were plotted to show the mean ± 95% confidence intervals; CoVs for the ground reaction force,
knee flexion and ankle flexion curves were also calculated (Winter, 1984). A difference between
control and trial was accepted where the trial curve lay outside the 95% confidence intervals
of the control.
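By way of illustration, the following sketch (Python/NumPy) computes a braking/propulsion impulse ratio from an antero-posterior force trace and an intrasubject coefficient of variation across repeated, time-normalised curves, using one common formulation of the CoV attributed to Winter (1984). The traces are synthetic, and the exact formulae used in the study may differ.

```python
# A sketch of two derived measures described above: the braking/propulsion
# impulse ratio from an antero-posterior force trace, and an intrasubject
# coefficient of variation across repeated, time-normalised trial curves
# (one common formulation of the measure attributed to Winter, 1984).
# All traces below are hypothetical.
import numpy as np

def impulse_ratio(ap_force, dt):
    """Braking impulse / propulsive impulse from an antero-posterior force trace.
    Negative values are taken as braking, positive as propulsion (assumed sign convention)."""
    braking = -np.sum(ap_force[ap_force < 0]) * dt
    propulsion = np.sum(ap_force[ap_force > 0]) * dt
    return braking / propulsion

def intrasubject_cov(trials):
    """CoV (%) across repeated trials, each time-normalised to the same length."""
    trials = np.asarray(trials)                 # shape: (n_trials, n_points)
    var_at_each_point = trials.var(axis=0, ddof=1)
    mean_curve = trials.mean(axis=0)
    return 100.0 * np.sqrt(var_at_each_point.mean()) / np.abs(mean_curve).mean()

# Hypothetical data: a crude antero-posterior force trace and 5 repeat curves.
t = np.linspace(0, 1, 101)
ap = -150 * np.sin(np.pi * t) * (t < 0.5) + 140 * np.sin(np.pi * t) * (t >= 0.5)
trials = [100 * np.sin(np.pi * t) + np.random.normal(0, 8, t.size) for _ in range(5)]

print(f"braking/propulsion impulse ratio: {impulse_ratio(ap, dt=0.01):.2f}")
print(f"intrasubject CoV: {intrasubject_cov(trials):.1f}%")
```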

Results
The orthotics made no significant difference to any angular velocities or displacements
during unloaded gait, nor to the ground reaction forces. The latter are in agreement with those
reported by Chao et al (1983). When comparing the unloaded conditions (braced and
unbraced), maximum medial force, mediolateral power and mediolateral work in the non-
dominant (N-D) leg were significantly different (Table 1, p<0.01).
Table 1. Unloaded

During loaded gait medio-lateral, antero-posterior and vertical work and power were
significantly increased (p<0.01) by wearing knee orthotics. Wearing orthotics increased
medial force and decreased lateral force (p<0.05). Vertical and propulsive impulses were
increased along with loading rate and positive torque (p<0.01). Braking impulse and vertical
thrust were significantly reduced (p<0.01) as well as maximum propulsive force (p<0.05).

When analysing the curves, mean intrasubject CoVs during the 40 kg and 40 kg with orthotics
conditions were low for knee flexion (16%, 20%) and ankle flexion (20%, 24%), as well as for
ground reaction force (21%, 27%). During load carriage no kinematic differences were
caused by the orthotics (p>0.05); specifically, knee flexion was not reduced by the orthotics
when carrying loads (peak mean knee flexion braced: 22.5°, peak mean knee flexion
unbraced: 25.0°), the maximum range of the orthotic flexion not being reached.

Discussion
Unloaded
Our data show that wearing knee orthotics has no adverse effect upon the gait of healthy
adults. This is different from the effects of a knee-ankle-foot orthosis (Cerny et al, 1990). It
seems that adverse changes in gait caused by orthotics only occur when ankle flexion is
restricted. The small medio-lateral changes observed in the non-dominant limb are consistent
with a greater rigidity at the knee which assists in maintaining balance. This lends support to
the theory that the non-dominant limb acts as the control limb for medio-lateral balance and
the dominant limb is primarily propulsive in its nature (Matsusaka, 1985; Sadeghi et
al, 1997).

Loaded
In our data the 40 kg load ranged from 47% to 64% of subject body weight. Although load
carriage of up to 64% of a subject's body weight does not grossly affect sagittal plane
gait motion, it may cause significant increases in the moments occurring at the hip, knee and
ankle, as well as increasing the impact and mediolateral balance forces. In an attempt to attenuate
these changes, knee orthotics were used as an intervention. Our data suggest that wearing
knee orthotics during load carriage does not reduce knee flexion and may cause an increase in
the physiological cost of load carriage. The significant differences in the mediolateral factors
suggest some asymmetry, but the high mean intrasubject CoVs mask any systematic
differences. The reduction in propulsive force and vertical thrust may relate to the possible
increase in metatarsal flexion at toe-off as suggested by Kinoshita (1985) and Martin et al
(1986) and is supported by the increase in positive torque, vertical and propulsive impulses.
The highly significant differences in propulsive and thrust forces that occur may be due to the
increased inertia of the lower limbs, in agreement with Martin (1985) and DeVita et
al (1996).
In conclusion, there was no benefit from wearing knee orthotics when carrying loads. Further
work with orthotics in 3D, where the load carried is normalised as a percentage of body
weight and lower limb moments are quantified, may show clearer but modest benefits.

Acknowledgements
The Defence Clothing and Textiles Agency, Science and Technology Division supported this
work. We would also like to thank Rene Nevola and Andrew Brammell for their assistance
with data collection and preparation.

References
Anderson, K, Wojtys, E.M, Loubert, P.V, and Miller, R.E. 1992, A biomechanical evaluation
of taping and bracing in reducing knee joint translation and rotation, The American
Journal of Sports Medicine, 20, 416–421
Bobet, J and Norman, R.W. 1984, Effects of load placement on back muscle activity in load
carriage, European Journal of Applied Physiology, 53, 71–75
Cerny, K, Perry, J, and Walker, J.M. 1990, Effect of an unrestricted knee-ankle-foot orthosis
on the stance phase of gait in healthy persons, Orthopaedics, 13, 1121–1127
Chao, E.Y, Laughman, R.K, Schneider, E, and Stauffer, R.N. 1983, Normative data of knee
joint motion and ground reaction forces in adult level walking, Journal of
Biomechanics, 16, 219–233
DeVita, P, Torry, M, Glover, K.L, and Speroni, D.L. 1996, A functional knee brace alters
joint torque and power patterns during walking and running, Journal of
Biomechanics, 29, 583–588
Ghori, G.M.U and Luckwill, R.G. 1985, Responses of the lower limb to load carrying in
walking man, European Journal of Applied Physiology, 54, 145–150
Hamill, J and Knutzen, K.M. 1995, Types of Mechanical Analysis. In Stead, L (ed.)
Biomechanical basis of human movement, (Williams & Wilkins, Media, USA),
458–489
Harman, E, Man, K.H, Frykman, P, Johnson, M, Russell, F, and Rosenstein, M. 1992, The
effects on gait timing, kinetics, and muscle activity of various loads carried on the
back, Medicine and Science in sports and exercise, 24, S129
Holewijn, M. 1990, Physiological strain due to load carrying, European journal of applied
physiology, 61, 237–245
Kinoshita, H. 1985, Effect of different loads and carrying systems on selected
biomechanical parameters describing walking gait, Ergonomics, 28, 1347–1362
Knapik, J, Harman, E, and Reynolds, K. 1996, Load Carriage using packs: A review of
physiological, biomechanical and medical aspects, Applied Ergonomics, 27, 207–215
Littlepage, G, Robinson, W, and Reddington, K. 1997, Effects of task experience and group
experience on group performance, member ability, and recognition of expertise,
Organisational Behaviour and human decision processes, 69, 133
Martin, P.E. 1985, Mechanical and physiological responses to lower extremity loading
during running, Medicine and Science in sports and exercise, 17, 427–433
Martin, P.E and Nelson, R.C. 1986, The effect of carried loads on the walking patterns of
men and women, Ergonomics, 29, 1191–1202
Matsusaka, N. 1985, Relationship between right and left legs in human gait, from a
viewpoint of balance control. In Biomechanics IX-A, (Champaign, Illinois, USA),
427–430
Osternig, L.R and Robertson, R.N. 1993, Effects of prophylactic knee bracing on lower
extremity joint position and muscle activation during running, The American journal
of Sports Medicine, 21, 733–737
Sadeghi, H, Allard, P, and Duhaime, M. 1997, Functional asymmetry in able-bodied
subjects, Human Movement Science, 16, 243–258
Styf, J.R, Nakhostine, M, and Gershuni, D.H. 1992, Functional knee braces increase
intramuscular pressures in the anterior compartment of the leg, The American journal
of Sports Medicine, 20, 46–49
Vasta, R, Rosenberg, D, Knott, J.A, and Gaze, C.E. 1997, Experience and the waterlevel
task revisited: Does expertise exact a price?, Psychological science, 8, 336–339
Winter, D.A. 1984, Kinematic and kinetic patterns in human gait: Variability and
compensating effects, Human Movement Science, 3, 51–76
INVESTIGATION OF SPINAL CURVATURE WHILE
CHANGING ONE’S POSTURE DURING SITTING

Frederick S.Faiks* and Steven M.Reinecke**

*Steelcase Inc. Grand Rapids MI 49501 USA


**Univ. of Vermont, Vermont Back Research Center,
Burlington VT 05405 USA

As sedentary, static work postures have become increasingly prevalent in our
workplaces, musculoskeletal problems—in particular, low back pain and
discomfort—have also increased. Researchers agree on the importance of
changing one’s posture while providing adequate back support. This study
provides the basis for developing a backrest that accommodates natural
human motion. Kinematic motion of twenty subjects was recorded in a
seated position. While moving from flexion to extension, both thoracic
kyphosis and lumbar lordosis increase. Thoracic curvature
changed uniformly through the full range of motion (80°–115°). Lumbar
curvature changed only as the thigh-torso angle exceeded 95°. The path and
rate of curvature of the lumbar spine (L3) are independent of the path and rate
of curvature of the thoracic spine (T6) and are a function of the complex
combined motion of pelvic rotation and variations in spinal curvature. These
findings suggest that a backrest should provide independent lumbar and
thoracic support to ensure that the backrest continues to support one’s
posture while promoting natural patterns of motion of the spine.

Introduction
Sitting is the most frequently assumed posture; approximately 75 percent of the workforce has
sedentary jobs. However, prolonged static sitting is frequently accompanied by discomfort and
musculoskeletal complications that result from sustained immobility (Hult, 1954; Eklund, 1967;
Magora, 1972; Kelsey, 1975; Lawrence, 1977). Reinecke et al. (1985) showed a correlation
between static seated postures and back discomfort, concluding that individuals are better able
to sit for prolonged periods when they can change their posture throughout the day.
Several researchers have evaluated the physiologic effects of changing one's posture or,
more directly, spinal motion. Holm and Nachemson (1983) investigated the effects of various
types of spinal motion on metabolic parameters of canine intervertebral discs. They suggest
that the flow of nutrient-rich fluids to and from the intervertebral discs increases with spinal
movement. Adams (1983) also found that alternating periods of activity and rest, thereby
introducing postural change, further boosts the fluid exchange, helping to nourish the discs.
Grandjean (1980) is another who maintains that alternately loading and unloading the spine
(through movement) is ergonomically beneficial, because the process pumps fluid in and out
of the disc, thereby improving nutritional supply.

Chaffin and Andersson (1984) have reported that the two most important considerations in
seating are adequate back support and allowance for movement or postural change. Good
seating should allow a worker to maintain a relaxed, but supported, posture and should allow
for freedom of active motion over the course of the day. Kroemer (1994) noted that a backrest
should allow for stimulation of the back and trunk muscles by moving through, and holding
the back in, various postures. While freedom of movement is beneficial, prolonged application
of muscle forces to the trunk also generates spinal compression, and a backrest can support
the trunk and serve as a secondary support mechanism, thereby reducing the necessary
muscle forces and reducing the compressive loading of the spinal column.
In summary, active movement and postural changes are inevitable, and in fact desirable,
throughout the day. Schoberth (1962) recommends changing postures around a relaxed,
upright, seated posture to minimize muscular activity and the static muscular load needed for
sitting. Most researchers agree that motion should be incorporated in seating while the body
is being supported in different postures.
Little information is available on spinal curvature and pelvic rotation while a person is moving
in a seat. The objective of this study is to describe the kinematic movement of the upper trunk
and use this information to aid designers in developing a backrest that actively accommodates
natural human motion in a relaxed and unrestricted manner. The resulting backrest system
should support the body, continuously and throughout the entire range of motion, but should not
constrain natural movement. A backrest that naturally moves with a person while continuously
providing support would allow the sitter to gain the physiologic benefits of spinal motion.

Methods
Subjects:
Twenty subjects (10 female, 10 male) participated in the study. Among the men, heights
ranged from 163.2 to 188.4cm (mean 176.2cm) and weights ranged from 59.5 to 93.4kg
(mean 75.9 kg). Among the women, heights ranged from 144.8 to 177.8cm (mean 165.9cm)
and weights ranged from 46.3 to 81.6kg (mean 60.1kg).

Procedures:
Targets, consisting of a light-emitting diode (LED) and a 1cm calculator battery, were
attached to the skin over the posterior vertebral body at the following locations: Thoracic
vertebrae, T1–T3–T6–T8–T10–T12, Lumbar vertebrae, L1–L3–L5, (Figure 1) and mid-point
femur and tibia while the subjects were seated in the test fixture. The test fixture allowed
subjects to move, unsupported, from a forward-flexed position (80° trunk-thigh angle) to
an extended, reclined position (115°) without affecting their natural motion (Figure 2).
During the data collection period, seat-pan tilt was adjusted to three positions: -5° rearward,
0° horizontal and +5° forward tilt. Positioned behind the seat pan was a fixed backrest that
served as a “safety backrest.” The backrest provided confidence as a backstop at the fully
reclined position. The backrest was split, with a 20-cm gap between the two lateral supports,
allowing enough room so that the LED targets would not become compressed when subjects
adopted a fully reclined position. Subjects were positioned and adjusted to the test fixture for
seat-pan height (popliteal height) and buttock position. A removable positioning support
ensured that all subjects’ buttocks were positioned in the same location.

Figure 1. Position of LED markers on spine
Figure 2. LEDs depict spinal and pelvic motion

Seat-pan depth was 44 cm with a 2.5 cm foam pad upholstered over a flat surface. Once
seated, subjects practiced moving through the full range of motion: 80° forward flexion to
115° extension. Once the subjects felt comfortable and natural with the motion, time-lapse
photographs were taken at a rate of 4 frames per second. Each test position was repeated to
evaluate repeatability. Subjects repeated the motion for all three seat-pan angles: +5° forward
tilt, 0° and -5° backward tilt.
The test procedure was repeated with a 76.2 cm work surface placed in front of the
subject. Subjects’ arms rested on the top of the work surface in the forward-flexed position.

Results
Both lumbar and thoracic curvature were measured using the National Institute for
Occupational Safety and Health (NIOSH) method. Lordosis angles were determined by
drawing a line connecting the points of the corresponding posterior vertebral body at L1 to L3
and L3 to L5. At the superior margins of L1 and L5, perpendicular lines were drawn so that
their intersection formed the angle of lordosis at the lumbar region. Kyphosis angles were
determined by drawing a line connecting the points T1 to T6 and T6 to T12. At the superior
margins of T1 and T12, perpendicular lines were drawn so that their intersection formed the
angle of kyphosis at the thoracic region.
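Numerically, the angle formed by the two perpendiculars described above equals the angle between the chords themselves, so either curvature angle can be computed directly from the sagittal-plane marker coordinates. A minimal Python sketch of that calculation is given below; the marker positions shown are hypothetical and are not taken from this study.

    import numpy as np

    def curvature_angle(p_upper, p_mid, p_lower):
        # Angle (degrees) between the chords p_upper->p_mid and p_mid->p_lower.
        # Because each perpendicular is simply its chord rotated by 90 degrees,
        # the angle between the two perpendiculars described in the text equals
        # the angle between the chords, so that angle is returned directly.
        v1 = np.asarray(p_mid, float) - np.asarray(p_upper, float)
        v2 = np.asarray(p_lower, float) - np.asarray(p_mid, float)
        cos_a = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
        return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))

    # Hypothetical sagittal-plane positions (cm) of the L1, L3 and L5 markers:
    lordosis = curvature_angle((0.0, 10.0), (1.5, 5.0), (0.0, 0.0))
    print(f"lumbar lordosis angle = {lordosis:.1f} degrees")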

Table 1. Average change in curvature from 80° to 115° forward flexion



The average change in thoracic curvature was 3° (S.D. 1.6°) for flexion angles between 80°
and 95° and 2.7° (S.D. 1.7°) between 95° and 115°. The average change in lumbar curvature was
0.08° (S.D. 1.1°) for flexion angles between 80° and 95° and 4.5° (S.D. 1.8°) between 95° and 115°.
Pelvic rotation was monitored by the translation of the thigh (femur) posteriorly, as the
pelvis rotates rearward about the ischial tuberosity. Quantitative data for the exact amount of
pelvic rotation could not be obtained. However, qualitative assessments of the magnitude of
pelvic rotation were made. The amount of pelvic rotation was observed by the distance the
thigh moved rearward; the greater the distance, the greater the amount of hip rotation. The
average rearward thigh translation was 5.57cm (S.D. 1.24cm).

Conclusion
Thoracic region of the spine
1) Thoracic curvature becomes more kyphotic as
the person reclines (figure 3).
2) Variance among subjects was significant.
3) Change in curvature was consistent for the full
range of motion, from 80° to 115°.
4) Seat-pan angle did not affect thoracic curvature.
5) Subjects displayed greater kyphosis when they
were seated at a work surface.

Lumbar region of the spine


1) Lumbar curvature becomes more lordotic as the
person reclines (figure 3).
2) Variance among subjects was significant.
3) Changes in curvature occurred primarily from 95° to 115° of the range of motion.
4) Seat-pan angle did not affect lumbar curvature.
5) Lumbar curvature was not affected by the presence of a work surface.
Figure 3. Thoracic kyphotic and lumbar lordotic curvature increases as a person reclines

Pelvic rotation
1) Pelvic rotation decreased when subjects were seated at a workstation.
2) Variance among subjects was significant.
3) Seat-pan angle did not affect pelvic rotation.

Discussion
Seating design should be considered from the perspective of the end users and their postural
requirements. Thus, a main objective should be to determine the ways in which a chair can
support the body while, at the same time, providing for unrestricted movement. One should
expect a chair to conform to, or accommodate, the body, rather than expecting the user to
conform to the shape of the chair. In order to refine design criteria consistent with this expectation,
this study was conducted to record kinematic motion of the back during unrestricted movement.

It was found that the motion of the upper trunk represents a combination of spinal
movement and pelvic rotation. As a seated individual moves from a forward-flexed position
(80° trunk-thigh angle) to a reclined position (115°), both thoracic kyphosis and lumbar
lordosis increase. The path and rate of motion of the lumbar spine (L3) are independent of
the path and rate of motion of the thoracic spine (T6); additionally, both parameters vary with
the complex, combined motion of pelvic rotation, as well as changes in spinal curvature.
To provide maximal support, a chair’s backrest should follow the motion of the back
while the seated individual changes position. The backrest must, therefore, be flexible
enough to provide continuous support in both an upright and reclined position. This study
demonstrates the need for a backrest that can change its contouring as an individual moves.
The thoracic region of the back requires a backrest that is capable of providing an
increasingly concave surface as one reclines further backward, while the lumbar region
requires a surface that is capable of increasing in convexity. Chairs which feature a single-
plane surface cannot provide this type of support.
In contrast, a dynamic backrest, one with a changing surface contour, will ensure that the
back is supported in all natural seated postures. Knowledge gained from this study of motion
can lead to a design solution that addresses the complexities of human movement and one that
provides more comfortable and healthy seating than do conventional chair designs.

References
Adams M.A. 1983, The effect of posture on the fluid content of lumbar intervertebral discs.
Spine. Volume 8, No. 6
Bendix T., Winkel J., Jessen F. 1985, Comparison of office chairs with fixed forwards or
backwards inclining, or tiltable seats. European Journal of Applied Physiology, 54:
378–385
Chaffin D.B., Andersson G.B.J. 1984, Occupational Biomechanics, New York, (John Wiley
& Sons)
Eklund M. 1967, Prevalence of musculoskeletal disorders in office work. Socialmedicinsk,
6, 328–336
Grandjean E. 1980, Fitting the Task to the Man, Third Edition, (Taylor and Francis, London)
Holm S., Nachemson A. 1983, Variations in nutrition of the canine intervertebral disc
induced by motion. Spine, 8(8):866–874
Hult L. 1954, Cervical, dorsal and lumbar spine syndromes. Acta Orthopaedica Scandinavica
(Supplement 17)
Kelsey J. 1975, An epidemiological study of the relationship between occupations and acute
herniated lumbar intervertebral discs. International Journal of Epidemiology, 4, 197–205
Kroemer K.H.E. 1994, Sitting (or standing?) at the computer workplace. Hard Facts about Soft
Machines The Ergonomics of Seating. Edited by Lueder R. and Noro K., (Taylor &
Francis, London), 181–191
Lawrence J. 1977, Rheumatism in populations. (London: William Heinemann Medical
Books Ltd)
Magora A. 1972, Investigation of the relation between low back pain and occupation. 3.
Physical requirements: Sitting, standing and weight lifting. Industrial Medicine, 41,
5–9
Reinecke S., Bevins T., Weisman J., Krag M.H. and Pope M.H. 1985, The relationship
between seating postures and low back pain. Rehabilitation Engineering Society of
North America, 8th Annual Conference, Memphis, Tenn.
Schoberth H. 1962, Sitzhaltung, Sitzschaden, Sitzmobel. (Springer-Verlag, Berlin)
THE EFFECT OF LOAD SIZE AND FORM ON TRUNK
ASYMMETRY WHILE LIFTING

Gail Thornton* and Joanna Jackson**

*Formerly of Coventry University


**School of Health and Social Studies, Colchester Institute

The influences of load characteristics on trunk asymmetry during a lifting
manoeuvre were investigated. Asymmetry was defined as the degree of
rotation and side flexion occurring in the thoracolumbar spine. Using a same-
subject, repeated measures design, angular range of motion in the coronal and
transverse planes was measured using a tri-axial goniometer. Objects of
equal mass but different dimensions and form were used. ANOVA revealed
that there was no significant difference in the range of motion during the
lifting of the three objects. It was concluded that load size and form did not
significantly influence the degree of trunk asymmetry while lifting.

Introduction
The association between manual materials handling and occupational low back pain has been
well documented and widely reported. Many ergonomic evaluation techniques rely on
evidence based upon static biomechanical assessments of spinal loading during lifting
(Marras et al, 1993). In addition, many of these assessments only consider sagittally
symmetrical positions of the body. Waters et al (1993) demonstrated that consideration of the
dynamic components of lifting could be crucial to a proper understanding of the causes of
back injury.

The effects of asymmetry of the trunk during lifting have been strongly associated with an
increased risk of back injury. Significant risk factors for back injury include repetitive
twisting or side bending when lifting, even when loads are relatively light (Bigos et al, 1986).
Asymmetry (twisting and side bending) has been found to influence individual capability by
reducing trunk strength and increasing the degree of strain put on the intervertebral disc
(Kelsey, 1984; Shirazi-Adl, 1989). The dynamic components of lifts, defined as angular
ranges of motion, acceleration and velocity (Marras et al, 1993) have also been associated
with increased spinal loading.

Features of the object/load to be lifted, such as its size, bulk and unpredictability have also
been linked to the possibility of manual materials handling becoming more hazardous. Lift
styles, lift frequency, weight of load, size of load and static strength have all been
investigated. There appears to be a paucity of literature examining the effect of load size and
form on dynamic three-dimensional motions of the trunk. The degree of asymmetry occurring
with lifting varying loads of the same weight may give an indicator of risk. Marras et al
(1995) identified the need to establish trunk motion characteristics under in vivo occupational
lifting conditions to give improved reasoning about the mechanisms involved with back
injury. Knowledge of three-dimensional spinal position is one of the key elements they identify
as essential in any evaluation of injury risk during manual handling.

During this study asymmetry was defined as the degree of rotation and side flexion taking
place in the trunk. Rotational movements occur in the transverse plane of the body, about a
vertical axis. Side flexion occurs in the coronal plane about a sagittal axis.

Methodology
A same subject repeated measures experimental design was used. All subjects were measured
under three different lifting conditions. The objects to be lifted had a mass of 8kg; two were
boxes of different dimensions and one was a beanbag. The aim was to simulate loads that were
bulky and awkward, small and compact and with contents that were susceptible to shifting.

A convenience sample of fifteen subjects was used from a student population. None of the
subjects had any previous known back injury or pathology. None of the subjects had received
any formal training in manual materials handling. All subjects volunteered for the study and
gave written informed consent to participate.

During each lift spinal motion was recorded using the lumbar motion monitor (LMM). The
LMM is an electrogoniometer which is capable of measuring the instantaneous position of
the thoracolumbar spine in three dimensions (Marras, 1995). The LMM consists of an
exoskeleton, which represents the posterior guiding system of the spine; this is attached to the
subject by a two piece harness allowing the LMM to track the subject’s trunk motion. The
validity of the LMM has been established (Marras, 1992) as has its inter-tester and intra-
tester reliability (Gill and Callaghan, 1997). The LMM is calibrated in its carrying case and
this gives it a zero position for all three planes of movement. A subject wearing the LMM will
therefore demonstrate a negative reading in the sagittal plane when in relaxed standing which
will represent the lumbar lordosis.

Each subject wore the LMM with the harness fitted in relation to set bony landmarks. The upper
metal edge of the pelvic harness was aligned with the junction between the L5/S1 vertebrae and
the lower metal edge of the thoracic harness was aligned with the inferior angle of the scapula.

All subjects were informed that the load would be no greater than 10kg. The subjects lifted the
three objects in a random order to try and eliminate order effects. Each subject stood at a
marked point 125cm in front of a plinth and the object to be lifted was placed on the floor
25cm in front of the subject. The subject was instructed that on the command “1 2 3 Go” they
were to lift the object and place it onto the plinth; they could move freely using any style of
lift they wished. Each subject lifted the three objects during one test session.

Data was collected using the Industrial software available for the LMM. The range of motion
analysed was calculated using the upper and lower range summary statistics. This is based on
the peak ranges of motion recorded, in this study, for the movements of side flexion and
rotation to the left and right. Statistical analysis was undertaken using SPSS for Windows.
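A minimal sketch of this upper and lower range summary is given below: the peak excursion to either side is taken from the recorded angle-time series and their difference gives the range of motion. The sample trace is invented for illustration; it is not LMM output from this study.

    # Upper/lower range summary: peak excursion to each side is taken from the
    # recorded angle-time series, and their difference is the range of motion.
    # The sample trace below is invented (degrees, positive = right, negative = left).
    def range_summary(angles_deg):
        upper = max(angles_deg)
        lower = min(angles_deg)
        return upper, lower, upper - lower

    rotation_trace = [-2.0, 3.5, 8.1, 12.4, 9.0, 1.2, -4.6, -1.0]
    upper, lower, rom = range_summary(rotation_trace)
    print(f"upper {upper:.1f}, lower {lower:.1f}, range of motion {rom:.1f} degrees")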

Results
Fifteen healthy subjects were used in the study; ten females and five males.

Table 1. Subject Data

Data was analysed using SPSS for Windows, release 6.0. Descriptive data is presented in
table 2.

Table 2. Sideflexion and rotation in the thoracolumbar spine

A one way analysis of variance (ANOVA), repeated measures, was computed on each set
of data (side flexion and rotation). Results are presented in tables 3 and 4.

Table 3. ANOVA for variable rotation



Table 4. ANOVA for variable side flexion

The results from the ANOVA suggest that changes in load size or form did not
significantly affect the degree of asymmetry taking place in the trunk during lifting. Closer
examination of the descriptive data of individual subjects revealed that they appear to fall into
two subgroups: one demonstrating very little variation in spinal motion between the different
conditions, the other having considerable variation when lifting the different objects. The
majority of subjects demonstrated more rotation than side flexion when lifting
any of the three objects.
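For readers wishing to reproduce this style of analysis, one possible equivalent of the one-way repeated measures ANOVA (the study itself used SPSS) is sketched below using the statsmodels AnovaRM class; the data frame is randomly generated purely to illustrate the call, and the column names are assumptions rather than the study's variables.

    import numpy as np
    import pandas as pd
    from statsmodels.stats.anova import AnovaRM

    rng = np.random.default_rng(0)
    conditions = ["large_box", "small_box", "beanbag"]
    data = pd.DataFrame({
        "subject":  np.repeat(np.arange(1, 16), len(conditions)),       # 15 subjects
        "load":     conditions * 15,                                     # within-subject factor
        "rotation": 20.0 + rng.normal(0.0, 4.0, 15 * len(conditions)),   # degrees (invented)
    })

    # F-test for the within-subject factor 'load' (one observation per cell)
    print(AnovaRM(data, depvar="rotation", subject="subject", within=["load"]).fit())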

Discussion
The results of this study did not show a significant difference in spinal motion when lifting
three different loads; however, all the tasks did produce substantial three-dimensional motion
of the trunk. The following limitations and observations of the study should be considered:

• Only a small number of subjects were used, within this number a clearly
identifiable number appeared to have obvious differences in spinal motion
between the different conditions.
• The subjects were allowed little time to acclimatise to wearing the LMM. It could
have influenced, altered or restricted subjects’ movement.
• The hand position used during the lift will have influenced trunk motion. This
may have changed during the task once the lifter had assessed the weight of the
object.
• There appeared to be more spinal motion when lifting the small box. The large
box could have made subjects more cautious so a more controlled lifting style
may have been adopted. This may also relate to the influence of weight
knowledge on lifting technique where there is anticipation of the amount of effort
required to lift a load.
• Lift style was not controlled in the study. Making a visual recording of subjects’
lifting technique would have allowed for consideration of this variable and of the
effect of hand position.
• The range of motion analysed was the range through which the subjects moved in
side flexion and rotation. Consideration could also have been given to the
maximum range of movement achieved in any direction and to the interactions
between the movements.
• It should not be forgotten that there are many factors which can increase the risk
of injury whilst lifting.

References
Bigos, S.J., Spengler, D.M., Martin, N.A., Zeh, J., Fisher, L., Nachemson, A. and Wang,
M.H. 1986, Back injuries in industry: A retrospective study. II. Injury factors, Spine,
11, 246–251
Gill, K.P. and Callaghan, M.J. 1997, Intratester and intertester reproducibility of the lumbar
motion monitor as a measure of thoracolumbar range, velocity and acceleration,
Clinical Biomechanics, 11, 418–421
Kelsey, J.L., Githens, P.B., White, A., Holford, T., Walter, S.D., O’Connor, T., Ostfield,
A.M., Weil, U., Southwick, W.O. and Calogero, J.A. 1984, An epidemiological study
of lifting and twisting on the job and risk for acute prolapsed lumbar intervertebral
disc, Journal of Orthopaedic Research, 2, 61–66
Marras, W.S., Fathallah, F.A., Miller, R.J., Davis, S.W. and Mirka, G.A. 1992, Accuracy of a
three-dimensional lumbar motion monitor for recording dynamic trunk motion
characteristics, International Journal of Industrial Ergonomics, 9, 75–87
Marras, W.S., Lavender, S.A., Leurgans, S.E., Sudhakar, L.R., Allread, W.G., Fathallah, F.A.
and Ferguson, S.A. 1993, The role of dynamic three-dimensional trunk motion in
occupationally-related low back disorders, Spine, 18, 617–628
Marras, W.S., Lavender, S.A., Leurgans, S.E., Fathallah, F.A., Allread, W.G. and Sudhakar,
L.R. 1995, Biomechanical risk factors for occupationally related low back disorders,
Ergonomics, 38, 377–410
Shirazi-Adl, A. 1994, Biomechanics of the lumbar spine in sagittal/lateral moments, Spine,
19, 2407–2414
Waters, T.R., Andersen, V.P., Garg, A. and Fine, L.J. 1993, Revised NIOSH equation for the
design and evaluation of manual lifting tasks, Ergonomics, 36, 749–776
THE EFFECT OF VERTICAL VISUAL TARGET LOCATION
ON HEAD AND NECK POSTURE

Robin Burgess-Limerick, Anna Plooy, & Mark Mon-Williams

Department of Human Movement Studies


The University of Queensland, 4072
AUSTRALIA

Twelve participants viewed a visual target placed in 6 vertical locations
ranging from 30° above to 60° below horizontal eye level. This range of
vertical target location was associated with a 37° change in head orientation,
and a 53° change in gaze angle with respect to the head. The change in head
orientation was predominantly achieved through changes in atlanto-occipital
posture. Consideration of these data in light of preferred gaze angle data, and
neck muscle length/tension relationships, suggests that visual targets should
be located at least 15° below horizontal eye level.

Introduction
A change in the vertical location of visual targets influences both the vertical gaze angle of
the eyes relative to the head, and the orientation of the head relative to the environment
(Burgess-Limerick et al., in press; Delleman, 1992). In general, a large range of vertical gaze
angles and head orientations might be combined to view a visual target in any particular
vertical location. Similarly, any given head orientation may be achieved through combining a
large range of trunk orientations, cervical postures, and positions of the atlanto-occipital
joint. The aims of this paper are: (i) to describe the gaze angles and postures adopted to view
a large range of vertical target locations; and (ii) to explore the consequences of these changes
in terms of potential musculoskeletal discomfort.

Method
Twelve participants self-selected the height and backrest inclination of an adjustable chair. A
small screen television (4.5×5.5 cm) connected to a video player was mounted at 15°
intervals on a 65 cm arc. The arc was positioned so that its centre was at the same height as
each participant’s eyes in the self-selected sitting position. The television was placed in six
positions: +30, +15, 0, -15, -30, -45 and -60° with respect to a virtual horizontal line at eye
level. Each position was presented 3 times in random order. A modified Stroop task (the
television displayed a single word written in a contradictory colour, e.g. the word “red”
would appear written in green and participants were requested to name the word rather than
the colour it was written in) was performed for one minute in each trial with data collection
(at 10 Hz) occurring during the last 10 seconds.
Optotrak (Northern Digital) provided the 3-dimensional location of infra-red emitting
diodes placed on the outer canthus (OC), the mastoid process (MP) on a line joining the
external auditory meatus and the outer canthus, spinous process of the seventh cervical
vertebra (C7), and the greater trochanter (GT). The markers were used to define head and
neck angles in the sagittal plane. These angles were used to describe the position of the head
and neck modelled as three rigid links articulated at two pin joints at the atlanto-occipital
joint and C7. The position of the head with respect to the external environment was described
by calculating the position of a line joining OC and MP markers (the ear-eye line) with
respect to the horizontal. Gaze angle is reported with respect to the ear-eye line.
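A minimal sketch of how such sagittal-plane angles can be derived from the marker coordinates is given below. The coordinates, the 65 cm viewing distance and the function names are illustrative assumptions; the sketch simply applies the ear-eye line and gaze angle definitions stated above.

    import numpy as np

    def ear_eye_angle(oc, mp):
        # Inclination (degrees) of the ear-eye line (mastoid -> outer canthus)
        # above the horizontal, in sagittal-plane coordinates (x forward, y up).
        dx, dy = np.asarray(oc, float) - np.asarray(mp, float)
        return np.degrees(np.arctan2(dy, dx))

    def gaze_angle(oc, mp, target):
        # Angle (degrees) of the eye-to-target line relative to the ear-eye line;
        # negative values indicate gaze directed below the ear-eye line.
        dx, dy = np.asarray(target, float) - np.asarray(oc, float)
        return np.degrees(np.arctan2(dy, dx)) - ear_eye_angle(oc, mp)

    # Hypothetical sagittal-plane positions (cm): mastoid process, outer canthus,
    # and a target 65 cm from the eye at 15 degrees below horizontal eye level.
    mp, oc = (0.0, 0.0), (9.0, 2.0)
    target = (oc[0] + 65.0 * np.cos(np.radians(-15.0)),
              oc[1] + 65.0 * np.sin(np.radians(-15.0)))
    print(ear_eye_angle(oc, mp), gaze_angle(oc, mp, target))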

Results and Discussion


The average effect of vertical target location on vertical gaze angle and posture is summarised
in Figure 1. Participants responded to changes in visual target location with an approximately
linear change in both head inclination (as described by ear-eye position) and gaze angle.
Fixation on a visual target which varied through a 90° vertical range was achieved by an
average change in head orientation (ear-eye line) of 37° and a change in gaze angle relative to
the head of 53° (the average ratio of head inclination to gaze angle change was 0.70). Whilst
all participants exhibited linear changes in both variables (all individual participant
correlations were greater than 0.93), considerable individual differences existed in the ratio of
changes in head orientation to changes in gaze angle relative to the head (from 0.45 to 1.12).
Changes in head orientation were achieved predominantly through altering the position of
the atlanto-occipital joint (measured here as head angle) and, to a lesser extent, by changing
cervical posture (neck angle). The average change of 37° in head orientation across the target
locations was produced by an average change in atlanto-occipital position of 28°, a 7° change
in the posture of the cervical spine, and a 2° change in trunk inclination.
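The per-participant linear fits referred to above can be summarised by regressing head inclination and gaze angle on target location and taking the ratio of the two slopes. The sketch below does this with invented data whose slopes (0.41 and 0.59) are chosen only to resemble the average 37° and 53° changes over the 90° range; it is not the study's data or analysis code.

    import numpy as np

    target = np.array([30.0, 15.0, 0.0, -15.0, -30.0, -45.0, -60.0])  # deg re eye level
    rng = np.random.default_rng(1)
    head = 0.41 * target + 12.0 + rng.normal(0.0, 1.0, target.size)   # invented values
    gaze = 0.59 * target - 20.0 + rng.normal(0.0, 1.0, target.size)   # invented values

    head_slope, _ = np.polyfit(target, head, 1)
    gaze_slope, _ = np.polyfit(target, gaze, 1)
    print(f"r(head vs target) = {np.corrcoef(target, head)[0, 1]:.2f}, "
          f"slope ratio head/gaze = {head_slope / gaze_slope:.2f}")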
Previous research suggests that subjective preference is for visual targets to be located such
that the eyes are rotated downwards relative to the head. Kroemer and Hill (1986) reported the
average preferred gaze angle as 35° below the Ear-Eye line for visual targets at 1m.
We suggest that the reason for this preference is as follows. An observer’s eyes must
converge to maintain single vision of near visual targets. Ocular vergence is produced by
activation of the medial recti muscles of the eye. The muscles responsible for raising the eye
(the superior obliques) also create a horizontal divergent force on the eye. Raising the eyes to
view a target thus requires increased activation of the medial recti to maintain single vision.
Conversely, the muscles which lower the eyes also tend to create convergence, thus reducing
the activation required by medial recti. Visual discomfort may result from prolonged high
activation levels of medial recti. Indeed, anomalies of vergence are considered to be the
primary cause of visual discomfort when fixating near targets (Mon-Williams et al., 1993).
This simple mechanical model explains why observers prefer to look downwards to view near
targets. It also explains why the preferred vertical gaze angle gets progressively lower as
objects get closer (Kroemer & Hill, 1986).

Figure 1: (A) Posture as a function of target position; (B) Gaze angle and head
orientation as a function of target position; and (C) schematic representation of
postural and gaze angle changes as a function of selected target positions.

Interpretation of the consequences of the postures adopted to view different visual targets
also requires consideration of the biomechanics of the head and neck. The head and neck
system comprises a rigid head located above a relatively flexible cervical spine. Flexion and
extension are possible at the atlanto-occipital, and cervical joints. The ligaments and joint
capsules are relatively elastic, especially within the mid range, and a large range of movement
is possible without significant contribution from passive tissues.
The centres of mass of the head, and the head and neck combined, are anterior to the
atlanto-occipital and cervical joints. Consequently, when the trunk is vertical, extensor
torques about the atlanto-occipital and cervical joints are required to maintain static
equilibrium. A large number of muscles with diverse sizes, characteristics, and attachments
are capable of contributing to these torques. The suboccipital muscles, which take origin on
C1 and C2 and insert on the occipital bone, are capable of providing extensor torque about the
atlanto-occipital joint only; others (such as semispinalis capitis) provide extensor torque
about cervical as well as atlanto-occipital joints; while others provide extensor torque about
cervical vertebrae only.
Increased flexion at the atlanto-occipital joint increases the horizontal distance of the
centre of mass of the head from its axis of rotation (level with the mastoid process). Similarly,
with the trunk in a vertical position, an increase in flexion of the cervical spine increases the
horizontal distance of the centre of mass of the head and neck combined from the axes of
rotation in the vertebral column (and all else remaining the same, the horizontal distance of
the head from its axis of rotation). Hence, with the trunk approximately vertical both atlanto-
occipital and cervical flexion increases the torque required of the extensor musculature to
maintain static equilibrium.
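The static-equilibrium point made above can be expressed as a simple moment balance: the required extensor torque is the weight of the head multiplied by the horizontal offset of its centre of mass from the joint axis. The Python sketch below uses an assumed 4.5kg head mass and illustrative offsets; neither value is taken from the text.

    # Moment balance about the atlanto-occipital joint: the torque needed from the
    # extensors grows with the horizontal offset of the head's centre of mass.
    # Head mass and offsets below are illustrative assumptions only.
    G = 9.81            # gravitational acceleration (m/s^2)
    HEAD_MASS = 4.5     # kg, assumed

    def extensor_torque(offset_m, mass_kg=HEAD_MASS):
        return mass_kg * G * offset_m   # newton metres

    for offset_cm in (2.0, 4.0, 6.0):   # increasing atlanto-occipital flexion
        print(f"offset {offset_cm:.0f} cm -> {extensor_torque(offset_cm / 100.0):.2f} N m")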
However, the head and neck complex is inherently unstable, especially in the upright
position (Winters & Peles, 1990), and the neck muscles must do more than just balance the
external forces acting on the system. For the system to be stable, additional co-contraction is
required to increase the stiffness of the cervical spine and prevent buckling. The consequence
is that significant muscular activity is probably still required even if the head and neck are
positioned to minimise the flexor torque imposed by gravitational acceleration. Indeed, the
necessity for muscle activity to stabilise the cervical spine is likely to be greater when it is
relatively extended (Winters & Peles, 1990).
The tension generating capability of a muscle is highly dependent on its length. In
general, changes in posture at the atlanto-occipital and cervical joints will alter both the
moment arm and the average fibre length of muscles active to provide both the required
extensor torque and stiffness. While accurate measurements of moment arm and fibre length
changes are unavailable, it is clear that muscle fibres which produce extensor torque will be
shortened to some extent by increased extension of the head and neck. The best estimates
available (Vasavada et al., in press) suggest that the force generating capabilities of the
muscles which cross the atlanto-occipital joint rapidly deteriorate with extension of the
atlanto-occipital joint from neutral.
The suboccipital muscles in particular are relatively short, and even a small change in
average fibre length caused by extension of the atlanto-occipital joint is likely to cause
significant decrement in their tension generating capabilities. Yet it is precisely these muscles
which appear to be primarily responsible for vertical movements about axes high in the
cervical spine (Winters & Peles, 1990).

In this experiment participants responded to a 90° change in vertical target location by
changing cervical posture by only 7° on average, but changed the position of the atlanto-occipital
joint by nearly 30°. A neutral posture corresponds to a posture in which the ear-eye line is
approximately 15° above horizontal (Jampel & Shi, 1992). This corresponds to the average
posture adopted by participants in this experiment when the visual target was located between
15° and 30° below horizontal eye height. Higher visual targets are likely to lead to postures in
which the tension generating capabilities of the sub-occipital muscles are reduced.
These relationships between joint angle and tension generating capabilities may account
for the observation by Jones et al. (1961) that the most comfortable sitting posture
corresponded to a head orientation in which the ear-eye line was approximately horizontal.
Such a posture would correspond to the average posture adopted by participants in this
experiment when the visual target was more than 45° below horizontal eye level.
In summary, the implications of this experiment are that workplaces should be designed
such that visual displays are located at least 15° below horizontal eye level, and possibly
lower, depending on the distance of the visual target from the observer.

References
Burgess-Limerick, R., Plooy, A., Fraser, K., & Ankrum, D.R. in press, The influence of
computer monitor height on head and neck posture. International Journal of
Industrial Ergonomics.
Delleman, N.J. 1992, Visual determinants of working posture. In M.Mattila and W.
Karwowski (Eds.). Computer applications in ergonomics, Occupational Safety and
Health. (Elsevier, Amsterdam). 321–328.
Jampel, R.S. and Shi, D.X. 1992, The primary position of the eyes, the resetting saccade,
and the transverse visual head plane. Investigative Ophthalmology and Visual Science,
33, 2501–2510.
Jones, P.P., Gray, F.E., Hanson, J.A. & Shoop, J.D. 1961, Neck-muscle tension and the
postural image. Ergonomics, 4, 133–142.
Kroemer, K.H.E. and Hill, S.G. 1986, Preferred line of sight angle. Ergonomics, 29, 1129–
1134.
Mon-Williams, M., Plooy, A., Burgess-Limerick, R., & Wann, J. in press, Gaze angle: A
possible mechanism of visual stress in virtual reality headsets. Ergonomics.
Mon-Williams, M., Wann, J., & Rushton, S. 1993, Binocular vision in a virtual world:
Visual deficits following the wearing of a head mounted display. Ophthal. Physiol.
Opt., 13, 387–391.
Vasavada, A.N., Li, S., & Delp, S.L. in press, Influence of muscle morphometry and
moment arms on the moment-generating capacity of human neck muscles. Spine.
Winters, J.M. and Peles, J.D. 1990, Neck muscle activity and 3-D head kinematics during
quasi-static and dynamic tracking movements. In J.M.Winters and S.L.-Y. Woo (eds.).
Multiple muscle systems: Biomechanics and movement organisation, (Springer
Verlag, New York). 461–480.
OFFICE ERGONOMICS
IS A PRESCRIPTION OF PHYSICAL CHANGES SUFFICIENT
TO ELIMINATE HEALTH AND SAFETY PROBLEMS IN
COMPUTERISED OFFICES?

Randhir M Sharma

Division of Operational Research and Information Systems


The University of Leeds
Leeds
LS2 9JT

Although a highly controversial and emotive issue, it appears that finally a
time has arrived when it has become ‘acceptable’ to suggest that certain
health conditions may have computer use as a significant contributory factor.
Directives issued by the European Commission under article 118a of the Treaty
of Rome are widely recognised as the way forward. They consist of
recommendations designed to reduce the likelihood of health problems arising
as a result of computer use. The primary focus of these directives is the physical
components of the office. They are essentially a prescription of physical changes.
However, the work environment comprises both physical environment and job
organisation (Choon Nam Ong 1990). Can any approach which focuses solely
on the physical components be successful in reducing health complaints?

Introduction
It is estimated that by the turn of the century at least two out of every three people who work will
use a VDU (Bentham 1991). The potential costs of computer use related health problems are
therefore immense. Sharma (1996) presented the results of a survey conducted amongst staff
and students of the School of Computer Studies at the University of Leeds and computer users
employed by a large newspaper in India. The results indicated that many users were suffering, or
had suffered from health problems at some time. Most importantly, with regard to the work described
in this paper, the results showed that most users felt that they did not know enough about the
subject of health and safety and that more could be done to inform them. The starting point of the
experiment described in this paper is the observation that users had a variety of preferences for
the techniques which could be employed to educate them about health and safety problems. The
two most popular suggestions made were firstly, to use an introductory lecture and secondly, to
distribute information in a variety of formats. In addition to these suggestions it was agreed that
it would be useful to assess the effect of an ‘ideal working environment’.

Method
The experiment was conducted with the staff in India who had participated in the original
survey. There were three reasons for the choice of participants. Firstly, it was not possible due
to the high turnover of both staff and students at the School of Computer Studies to ensure
that those who had taken part in the initial survey would be able to take part in the
experiment. Secondly, funds for the experiment were made readily available in India. Finally,
the level of understanding concerning health and safety issues did not display a large
variance. This would allow any changes to be identified much more easily.
Users were split into four groups. After an initial survey, interventions were made in three of the
four groups. After three months the groups were surveyed again in order to assess whether the
interventions made had resulted in the reduction of any health problems. The first survey was carried
out in September 1996, the second follow up survey was conducted in January 1997. Although the
time between the two surveys was quite short, the results obtained suggest that it had been sufficient
to allow changes to manifest themselves. The interventions made are detailed below:

Group 1.
Existing chairs were replaced with new ergonomically designed chairs with adjustable
height, rake and armrests. Glare screens were provided in order to counter the effects of
fluorescent lighting. Wrist rests were introduced in order to provide cushioning for arms and
wrists. Copy holders were supplied to allow information which was being typed to be in the
line of view of operators. The height of display units was raised to encourage users to adopt
an upright working posture. Finally display screens were repositioned in order to encourage a
greater distance between the user and the screen.
An important point to note is that users were not given any guidance about how the
equipment should be used or any additional information on working practices. The aim in
doing this was to simulate typical work environments where often both ergonomic furniture
and accessories are introduced, without advice about how these should be used.

Group 2.
The second group was provided with information in a number of different formats. Booklets
and leaflets containing information about various health problems were located in offices.
These were placed in such a way that they could neither be removed from offices nor obscured.
There was, however, no compulsion for the users to read this information. Two types of
poster were also positioned on the walls of the offices used by this group. The first type was
of A3 size and contained single sentences in large, bold fonts. The sentences used
contained guidelines about working distance, frequent breaks and a reminder to stretch
regularly. The second type of poster illustrated a ‘correct’ working posture.

Group 3.
The third group was given an introductory forty-five minute talk about health problems
associated with computer use, and the steps that could be taken to eliminate these problems.
The problems covered were musculoskeletal injuries, visual problems and postural problems.
The talk also contained a question and answer session which allowed the audience to ask
about anything which they did not understand.

Group 4.
The fourth group was used as the control group. No further intervention was made in this group.

Results
The survey focused on four key areas: working habits, frequency of problems,
contribution of particular components to problems and finally the level of knowledge about
health and safety. The working habits examined were the number of hours per week spent
using a computer, the time between breaks, the distance between the user and the screen and
whether or not the user stretched before or during working. The questions concerning the
frequency of problems and the level of blame attached to each component both used a four
point scale. In the case of problem frequencies 1 indicated a high frequency and 4 indicated
never. For the questions regarding the level of blame 1 indicated a large contribution and 4
indicated no contribution at all.
The results produced were analysed using the ‘Wilcoxon Signed Rank Test’.
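One way to reproduce this kind of paired, non-parametric comparison is sketched below using scipy.stats.wilcoxon (the paper does not state which software was used); the before and after ratings are invented examples on the four point scale described above.

    from scipy.stats import wilcoxon

    # Paired problem-frequency ratings (1 = frequent ... 4 = never) for the same
    # users before and after an intervention; the values are invented examples.
    before = [2, 1, 3, 2, 2, 1, 3, 2, 4, 2]
    after  = [3, 2, 3, 3, 2, 2, 4, 3, 4, 3]

    stat, p = wilcoxon(before, after)   # paired, non-parametric comparison
    print(f"W = {stat:.1f}, p = {p:.3f}")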

Group 1
Surprisingly the frequencies of certain health problems rose. The mean problem frequency ratings
can be seen in the table on the next page. When the working habits of the group were assessed it
was found that users were working a greater number of hours per week and sitting closer to the
screen, despite the fact that monitors had been repositioned. Users were also working longer
between breaks. Fewer users were stretching before or during work. There was no change in the
level of knowledge of terminology. Surprisingly, when asked if they felt that they knew enough
about health and safety there was a 35% increase in the number of users who felt that they knew
enough, despite there being an increase in the frequency of health problems. When questioned
about the factors that they considered responsible for their problems there was a significant reduction
in the level of blame attached to those components which had been modified or replaced.

Table 1. Mean Problem Frequency Ratings


n<denotes that the sample size was not large enough to make a comparison.

Group 2
The frequencies of several health problems fell quite dramatically in this group. This fall can
be attributed quite clearly to the changes in the working habits of the users. Although the
number of hours per week spent using a computer did not change, the time between breaks
fell, the distance between the user and the screen increased and the number of users who
stretched before or during work increased by 65%. The level of familiarity with terminology
also rose; after three months 50% of respondents were familiar with the terms RSI and
WRULD. Despite this increase in knowledge and the reduction in health problems none of
the respondents felt that they knew enough about health and safety. When questioned about
components of the office which users felt were responsible for problems, there was no change
from the previous survey. It appeared that, despite being better informed, users were unable to
isolate particular components of the office as being responsible.

Group 3
Group 3 showed very little change. The introductory talk, although simple in its content, did
not have any lasting effect. There was no change in the frequency of health problems or in the
working habits of the users. There was also no change in the level of familiarity with
terminology or the level of blame attached to individual components. The only observation
which was made from this group was that, surprisingly, some of the users now
felt that they knew enough about health and safety.

Group 4
Group 4 was the control group; no changes were observed in either the frequency of health
problems or the working habits of users. The level of knowledge about health and safety and
the level of blame attached to individual components were also unchanged.

Discussion
The worrying aspect about group 1 is that it was an attempt to simulate a typical employer
response. Modifications are often made to working environments because of ergonomic
concerns without any supporting guidance as to how this equipment should be used. The
results of group 1 indicate that physical changes alone are not sufficient to reduce the
frequency of health complaints. The significant change in the working habits of the users is
also a matter of concern. The results indicated that the users had adopted working habits
which had countered any benefits that may have been obtained from the changes made.
The most interesting observation concerned the level of blame attached to individual
components. Although the frequency of certain health problems had risen, the level of blame
attached to individual components of the office fell. The author believes that the group was
reluctant to blame those components of their office which they believed were causing them
problems because they had been modified or replaced by an ‘expert/authority’ figure.
If the results obtained here are typical, they have very serious implications. It appears that
by simply modifying the environment, we may do little more than silence complaints. The
frequency of problems had risen, yet despite this there was a decrease in the blame given to
individual components, there was also an increase in the number of people who were satisfied
with their level of knowledge. Generally the consensus among users was that all that could be
done had been done and that little more could be done to help. This is dangerous: we do not
want health problems as a result of computer use to be seen as the norm.
The changes made to the environment of group 1 appeared to have done more harm than
good. This does not however mean that they are not necessary. Much research has been done
to assess the effectiveness of ergonomic aids. This research has shown that there are benefits
to be achieved. The results obtained from group 2 suggest that in order for them to be
effective they have to be implemented in parallel with a package of measures which educate
users about the manner in which they should be used.

Group 2 displayed a significant reduction in health problems after only three months. There
was also a significant change in the working habits of the users. These improvements were
obtained without the introduction of changes to the working environment. The importance of
the format in which information is delivered was clearly demonstrated by this group.
Individuals learn differently and favour different techniques. As with any form of education it
is important that the audience is given due consideration when the mechanisms of delivery for
such information are considered. The information distributed to group 2 was in a variety of
formats in order to ensure that it appealed to as many users as possible.

The final observation from group 2 was that improvements were made at a minimal cost.
The cost of the improvements was the paper on which the information was printed. The costs
of preventative measures, and in particular the costs of meeting legislative requirements, are
often cited as an obstacle to the implementation of ‘best practice’ as defined by legislation
(Gough & Sharma 1998). The results of group 2 have shown that a healthy workplace can be
achieved without the need for a large scale financial investment.

An introductory lecture or presentation was one of the most popular suggestions made by
respondents in the initial survey. The results of group 3 illustrate clearly how unsuccessful
this technique was. Overall, little change was achieved; the presentation had very little effect
on the group. It is immediately apparent from these results that a more comprehensive
approach is required. Unfortunately in many organisations users do not even receive this. The
group had not remembered any of the presentation and realistically could not have been
expected to. They had been forced to attend a single presentation and then asked to try and
absorb information which was completely alien to them. They had no feedback and no one to
whom they could ask questions or from whom they could receive guidance.

The results of group 4 indicated that little change had occurred in the group during the
three months. This lack of change was important if we are to assume that the changes
witnessed in the other groups occurred as a result of the interventions made.

Conclusion
The approach used for group one was a component based approach. The focus of the
approach was on the physical characteristics of the working environment. The results of this
approach were that firstly the frequency of problems had increased and secondly the blame
attached to those components which had been modified had fallen. The users were suffering
but were unable to pinpoint why.
The key problem with article 118a is that it is primarily component based. The document
focuses mainly on the hardware found in computerised offices. Regulation 7 looks at the provision
of information, section 1a of this regulation states ‘Every employer shall ensure that operators
and users at work in his undertaking are provided with adequate information about all aspects
of health and safety relating to their workstations’. This particular guideline is however all too
easily forgotten. The results of group 2 indicate that a much greater amount of attention needs
to be paid to this regulation. Only by educating users can we equip them to deal with the health
problems which have been shown to be linked to computer use. Simply giving users the latest in
furniture and ergonomic accessories is no guarantee of a healthier workplace.
Any methodology which seeks to eliminate health and safety problems must have as its
foundation a comprehensive programme of education. More importantly however this
programme needs to be tailored to the needs of the user. It has to contain the information
users need in the format that users want. Otherwise resources spent on modifying
computerised offices in order to protect users will do little more than harm them further.

References:
Bentham P. 1991, VDU Terminal Sickness: Computer Risks and How to Protect Yourself.
(Green Print)
Choon-Nam Ong, 1990, Ergonomic Intervention for Better Health and Productivity. in
Promoting Health and Productivity in the Computerised Office, edited by S.Sauter.
(Taylor & Francis)
HSE 1992. Display Screen Equipment Work: Guidance on Regulations.
Gough T.G. & Sharma R.M. 1998, Health & Safety in Information Technology—an International
Educational Issue, accepted for BITWORLD '98.
Sharma R.M. 1997, Health and Safety in Computerised Offices, The Users Perspective. in
S.A.Robertson (ed) Contemporary Ergonomics 1997, (Taylor & Francis, London),
257–262.
AN EVALUATION OF A TRACKBALL AS AN
ERGONOMIC INTERVENTION

Barbara Haward

Robens Institute for Health Ergonomics


University of Surrey
Guildford, GU2 5XH

In the workplace, intensive mouse users are observed to develop work
related musculoskeletal disorders more frequently than keyboard only users.
As an intervention measure to assist users with symptoms, a trackball has
been used, but never strictly evaluated. A workplace study was undertaken
with experienced mouse users performing their normal work activities using
a questionnaire to gather data and select subjects for follow up study. In the
small follow up study, changing the input device had little effect on subjects’
perceptions of tiredness and discomfort and work practices. Subjects
perceived fatigue and discomfort as related to work load and long working
hours rather than to the input device. This, associated with observations on
motivation levels and degree of control over their work, indicates the
importance of psychosocial factors in the work system.

Introduction
Worldwide use of information technology has been increasing rapidly since the early 1970s,
especially with the introduction of low cost, smaller computer hardware resulting in the
proliferation of desktop computers. Software development has led to the introduction of more
accessible packages that rely on Graphical User Interfaces (GUIs). These incorporate
windows, ‘pop up’ menus, icons and dialogue boxes on the computer screen. Most available
software packages use GUIs and are dependent on input devices such as the mouse, using
‘point and click’ operations. With the increasing use of these in the workplace, it is likely that
mouse users are at risk of developing work related musculoskeletal problems as a result of
postures adopted and the nature and duration of work tasks.
Little epidemiological research has been undertaken in this area despite mice being a
standard feature of most desktop computer configurations. Research carried out has been on
simulated activities of short duration (eg. Karlqvist, 1994) or on mouse users who use the
device for short time periods (Hagberg, 1994).
Real workplace problems have been encountered with computer users who use mice
intensively to carry out work tasks, often 6–8 hours per day. These users have been observed
to develop and report work related musculoskeletal disorder symptoms at an apparent greater
frequency than keyboard only computer users. As an intervention measure, an alternative
input device, a trackball, has been used in the workplace for those reporting symptoms. This
has enabled users to continue working and has appeared to ‘help’ in the reduction of
symptom severity.
The use of a trackball as an intervention measure has never been strictly evaluated,
therefore the study aims were to:

1. Evaluate whether changing a mouse for a trackball had any effects on:
(a) Subjective perception of fatigue and discomfort.
(b) Exposure to two of the possible factors implicated in the onset of work related
musculoskeletal disorders—posture and work organisation (rest breaks, work
duration, work rates).
2. Explore the characteristics of musculoskeletal symptom reporting within a group of
mouse users with reference to:
(a) Time spent using the mouse
(b) Number of work breaks
(c) Length of work breaks.

The work was carried out as a workplace study in a large Information Technology company
using subjects who were experienced mouse users and who performed normal work activities
for the duration of the study.

Methods

Initial Study
The aim was to gain information regarding musculoskeletal symptom prevalence, so a
questionnaire was devised to collect both retrospective information and subjects’ perceptions of
their work activities. This contained questions on general demographics, work organisation
(hours worked at VDU, time spent using mouse, number and length of work breaks) and
focussed on musculoskeletal health symptoms using a modified Standard Nordic
Questionnaire (Kuorinka, 1987).
Responses to the initial questionnaire were matched against selection criteria to choose
healthy (ie. symptom free) subjects for the follow up study. These criteria were:

(a) Willing to participate
(b) No reported musculoskeletal symptoms in the previous 12 months or 7 days.
(c) Mouse user
(d) VDU user for >5 hours of the working day.
(e) Mouse used for >50% of VDU tasks.

Initial data analysis indicated that none of the sample matched these criteria, therefore (b) and
(e) were modified as follows:

(b) No reported musculoskeletal symptoms in the previous 7 days in the upper limbs only.
(e) Mouse used for >50% VDU tasks, or 26–50% of tasks for ≥6–7 hr VDU use.

Follow up study
The purpose of the study was to evaluate the effect of substituting a trackball for a mouse,
therefore objective and subjective measurements were used to detect any changes occurring.
RULA (McAtamney and Corlett, 1993), Body Part Discomfort (Corlett and Bishop, 1976),
workplace environment measures (lighting levels, temperature,) and anthropometry
were used.
Each device was used for a 3-week study period and subjects completed one questionnaire
per device type at the end of this time. Questionnaire content was designed to elicit subjective
information regarding fatigue and discomfort and work practices. RULA posture analysis
was undertaken at 4 separate times per subject and device type. Device types used were a
standard IBM 2-button mouse of curved, rounded shape and a Logitech Trackman Marble, 3-
button trackball.

Results
Initial Study
83% (n=29) of the sample (n=34) reported at least one musculoskeletal problem in the
previous 12 months. Chi-squared test results indicated that the prevalence of symptoms by
body part was significant for neck, neck and back, and 12 month wrist symptoms.
There was little observed difference between percentage mouse use and symptom
frequency, and for some body parts (right wrist, neck and upper back) a higher frequency of
symptoms was reported for less than 50% mouse use per day. Similarly, there were no
observed differences between number and length of work breaks taken and symptom
occurrence. (Number and length of work breaks ranged from 0–7 per day and 0–30 minutes
duration respectively.)
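A minimal sketch of this kind of cross-tabulation analysis, assuming hypothetical counts (the actual cell frequencies are not reported here) rather than the study data, is:

# Illustrative only: hypothetical counts, not the study data.
from scipy.stats import chi2_contingency

# Rows: <50% mouse use per day, >=50% mouse use per day
# Columns: symptom reported in previous 12 months, no symptom reported
observed = [[12, 4],
            [17, 1]]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-squared = {chi2:.2f}, df = {dof}, p = {p:.3f}")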

Follow up study
For the 6 subjects who participated, subjective feelings of tiredness were compared by
subject and device type, and there was little observed difference between device type and
tiredness rating (refer to Table 1).

Table 1

Subjects rated whether their level of tiredness had changed during the study period (refer to
Table 2), but this was not perceived to be related to the device type used.

Table 2

Change in rate of work pace was subjectively assessed (summarised in Table 3) and again
little difference was observed.

Table 3

Physical discomfort ratings also indicated no differences between device type and body
part discomfort, for the group as a whole, although at individual level one subject did
comment that they had less wrist aches using the trackball.
RULA grand scores were ‘3’ for all users and both device types, except
for one ‘4’ rating. When the upper limb scores alone were analysed, there was greater
variation with a range of ‘3’ to ‘5’ ratings, but this was not significant when subjected to Chi
squared testing. Although some workstation mismatches were apparent from physical
measurements, these were not confirmed by RULA posture observations.
Environmental measurements were taken to identify any factors that could alter subjects’
perceptions of their work environment and hence contribute to psychosocial aspects of the
workplace. Temperature and relative humidity ranged from 20–23°C and 35–45%
respectively, within desirable limits for air-conditioned offices, and workstation illuminance
varied from 360–900 lux.

Discussion and Conclusions


Percentage mouse use was not observed to be significant with respect to musculoskeletal
symptom occurrence, which did not concur with previous experience or research findings.
Hagberg (1994) found differences between low (2 hours/week) and high (10 hours/week)
mouse users with regard to shoulder-scapular and hand-wrist-finger symptom prevalence.
Karlqvist (1994) found subjective discomfort ratings higher for keyboard operators than
mouse users for the neck and shoulder region but that mouse users reported higher discomfort
for the forearm/wrist region. The number and length of work breaks also had no apparent
effect on symptom occurrence, but there is a lack of available data to compare this against.
Changing the input device had little observable effect on the way that subjects perceived
and subjectively rated their tiredness, changes in tiredness and body part discomfort. Based
on subjects’ comments it was apparent they perceived tiredness and fatigue to be more related
to work load levels than to device type being used. Greater differences may have been
reported if the trackball had been located in a different position on the desktop to the mouse,
but physical space constraints prevented this. Harvey and Peper (1997) showed a centrally
placed trackball resulted in lower muscle tension levels than the mouse to the right side of the
keyboard. Swanson et al (1997) assessed VDU keyboards and found that different designs
had little impact on subjects’ reports of fatigue and discomfort. This supports the findings of
the current study that workstation hardware changes are not perceived by users as being
related to fatigue and discomfort, but that other work organisation factors are of greater
importance.

Changing the input device did not change the subjects’ work organisation in terms of the
number and length of work breaks and overtime hours worked. One explanation for this is
that because subjects were musculoskeletal disorder symptom free they had no reason to
modify their work behaviour to cope with any discomfort and pain.
Posture assessments for the study subjects indicated that the only difference between the
devices was the observed reduction in ulnar deviation when using the trackball. Previous
studies (Karlqvist, 1994) have found ulnar deviation to be a risk factor for upper limb
discomfort and pain. The observed reduction in ulnar deviation may be beneficial to someone
experiencing ulnar nerve/ulnar region symptoms, but this small study does not provide
sufficient evidence to substantiate this.
From observations and discussions with the follow up subjects it was apparent they
perceived tiredness and fatigue as more related to excessive workloads and long working
hours than to the device types being used. They were a highly motivated group, familiar with
high workloads and working to time deadlines, and coped with them well. This suggests the relevance of
psychosocial factors in the work system, which have emerged as being of importance in the
multifactorial nature of work related musculoskeletal disorders.
It is recommended that trackballs should still have a role in an ergonomics intervention
programme, but as part of a range of intervention measures employed to alleviate work
related musculoskeletal disorder symptoms where both physical and psychosocial attributes
of the workplace are considered.

References
Corlett, EN; Bishop, RB; 1976, A technique for assessing postural discomfort. Ergonomics
19(2), pp 175–182.
Hagberg, M; 1994, The ‘mouse arm’ syndrome-concurrence of musculoskeletal symptoms
and possible pathogenesis among VDU operators. In Grieco, A., Molteni, G., Piccoli, B.
and Occhipinti (eds.) Work with Display Units 94, (Elsevier Science, Holland) pp 381–
385.
Hales, TR; Sauter, SL; Peterson, MR; Fine, LJ; Putz-Anderson, V; Schliefer, LM; Ochs, TT;
Bernard, BP; 1994 Musculoskeletal disorders among visual display terminal users in
a telecommunications company. Ergonomics, 37(10), pp 1603–1621.
Harvey, R; Peper E; 1997, Surface electromyography and mouse use position. Ergonomics
40(8), pp 781–790.
Karlqvist, L; Hagberg, M; Selin, K; 1994, Variation in upper limb posture and movement
during word processing with and without mouse use. Ergonomics, 37(7), pp 1261–
1267.
Kuorinka, I; Jonsson, B; Kilbom, A; Vinterberg, H; Biering-Sorenson, F; Andersson, G;
Jorgensen, K; 1987, Standardised Nordic Questionnaire for the analysis of
musculoskeletal symptoms. Applied Ergonomics, 18(3), pp 233–237.
McAtamney L; Corlett, EN; 1993, RULA—A survey method for the investigation of work
related upper limb disorders. Applied Ergonomics, 24(2) pp 91–99.
Swanson, NG; Galinsky, TL; Cole, LL; Pan, CS; Sauter, SL; 1997, The impact of
keyboard design on comfort and productivity in a text entry task. Applied
Ergonomics, 28(1). pp 9–17.
Old methods new chairs. Evaluating six of the latest
ergonomic chairs for the modern office.

Alan Esnouf and Professor Mark Porter

Sarum Road Hospital Winchester SO22 5HA


Department of Design and Technology Loughborough University
Loughborough LE11 3TU

In a large office complex such as Shell-Mex House where 90% of the work
force are seated at workstations for most of their working day, it is
imperative to ensure that the work force stays as comfortable as possible.
Using a variety of evaluative methods this project evaluated six of the latest
chairs on the market to recommend a suitable chair for the office staff of
Shell-Mex House. The relative merits of long and short term evaluations
were also compared. One chair, chair 6 (the Sedus Paris), was rated as the most
preferred: it produced the least discomfort in all eleven body parts of the
Corlett questionnaire (Corlett and Bishop, 1976) and was also rated as the
most comfortable in the paired comparisons.

Introduction
Work on chair comfort started in earnest in the forties and fifties with studies by
Akerblom (1948) and Keegan (1953), and the momentum has continued ever since. The
methodology to date is very varied but basically fits into two categories, subjective or
objective, and can be further subdivided into field and laboratory studies. The research so far
indicates that there is no one definitive method for predicting chair comfort, therefore this
project employed a combination of subjective field studies and objective laboratory studies,
namely paired comparisons, a field evaluation study and a pressure distribution study. The
definition of comfort used in this project was that used by Shackel et al. (1969) and others:
a lack of discomfort.

Aims
To establish the extent to which short term trials can be used as a replacement for longer trials
(a full day) when selecting a suitable visual display unit (VDU) user chair. A second aim of
the project was to select a comfortable chair for VDU users in Shell-Mex House.

The body part questionnaires produced similar results: chair 6 (Sedus) showed the least
amount of discomfort. In the four body segments, upper arms, legs, lower arms and
shoulders, none of the subjects recorded any discomfort at the end of their working day in
chair 6 (Sedus). At least 50% of the subjects recorded the least amount of discomfort in all 11
body segments in the Corlett questionnaire.
Table 1 shows chair 6 (the Sedus) to be rated the most preferred chair, with the least
recorded percentage of discomfort in all 11 body segments.

Table 1. Order of preference from analysis of Corlett questionnaire.

Discussion
It is interesting to note that despite the many similarities between chair 1 (Simultan) and chair
6 (Sedus), chair 6 was preferred to chair 1 in every parameter evaluated. The major difference
identified from the chair feature checklist was that 10 of the subjects considered the seatpan
cushion too hard or much too hard in chair 1 (the Simultan). Also of note was that in this study the
upper back was the area of most discomfort; with the increasing use of computers this trend
may become more pronounced. The validity of the comfort/discomfort measures was well
supported, with the two versions of the Corlett questionnaire, the Shackel comfort scale and
a comfort/discomfort question in the chair feature checklist producing remarkably similar results.

Conclusion
From the literature it is evident that expert opinion, standards and recommendations are not a
reliable method for predicting chair comfort, and can only be used as a start point when
attempting to select a comfortable chair. This study has shown that long term user trials are a
viable method of selecting a chair for the modern office. It is also true to say that no one chair
will ever suit everyone so it is important to have alternatives available. The inclusion of a
chair feature checklist helped to provide information on what made one chair more
comfortable when compared to another.

Seat Pressure Distribution Study


Three subjects sat in the same six chairs as used in the previous studies described above. One
female, 151cm stature (10th percentile), weight 45kg; one male, 175cm stature (50th
percentile), weight 75kg; and one male, 190cm stature (95th percentile), weight 93kg.

Paired Comparisons study


In the paired comparisons study 48 male and 52 female subjects took part; their weight
ranged from 107lbs/48.6kg (lightest female) to 191lbs/90kg (heaviest male). Stature ranged
from a 4ft 11ins/150cm female to a 6ft 5ins/195cm male. Age range was 23 to 55
years old.
All six chairs were aesthetically similar and a similar shade of blue despite having a
variety of different ergonomic features. Chairs 6 and 1 (the Sedus and Simultan) were fully
synchronised and dynamic. Chairs 3, 4 and 5 (the Criterion, Dauphin and Sitrite) had dynamic
backrests and tilting seatpans. Chair 2 (the Ahrand) had the addition of adjustable seatpan
depth.

Procedure.
The trials were conducted in rooms very similar to the subjects’ workplace. Each
subject was given the same information on the trial and a checklist on which to record their
preferences. The checklist included all possible pairings of the six chairs, in an order such that no
chair was paired in succession. Subjects started at different places on the checklist so that
particular comparisons were not always made at the beginning or end of the trial. They were given a few
minutes on each chair to adjust it to their liking, and were instructed to make their selection
quickly based on their initial sensations in the chair. They had to record their preferred
cushion, backrest and whole chair. Each trial took approximately 45 minutes per subject.

Results
Using the method of paired comparison (Guilford, 1954), chair 6 (the Sedus Paris) was
clearly the most preferred chair in all three parameters (seat cushion, backrest and whole seat);
chair 1 (Simultan) was the second most preferred chair.
Figure 1 shows the scale value for each chair. The zero value was set to the least preferred
chair and is not absolute, but the scale values do allow a relative assessment of the six chairs
as the scale is at least linear.

Figure 1. Scale value from paired comparisons
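A minimal sketch of how scale values of this kind can be derived from a paired comparison preference matrix, in the spirit of the scaling described by Guilford (1954) and using an invented preference matrix rather than the study data, is:

# Illustrative paired-comparison scaling (Thurstone Case V style). The
# preference counts below are invented, not the study data.
import numpy as np
from scipy.stats import norm

n_judges = 100
# wins[i, j] = number of judges preferring chair i to chair j (hypothetical)
wins = np.array([
    [ 0, 55, 70, 65, 60, 40],
    [45,  0, 68, 62, 58, 35],
    [30, 32,  0, 48, 45, 20],
    [35, 38, 52,  0, 50, 25],
    [40, 42, 55, 50,  0, 30],
    [60, 65, 80, 75, 70,  0],
])

p = wins / n_judges                 # proportion preferring chair i over chair j
p = np.clip(p, 0.01, 0.99)          # avoid infinite z-scores for unanimous pairs
np.fill_diagonal(p, 0.5)
z = norm.ppf(p)                     # unit normal deviates
scale = z.mean(axis=1)              # one scale value per chair
scale -= scale.min()                # zero anchored to the least preferred chair
print(np.round(scale, 2))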


Discussion
When analysing the data there was no noticeable relationship between chair preference and
gender, stature or weight. This could well indicate that the features of each chair played a major
part in subject selection. The methodology of paired comparisons, however, is not precise
enough to be categorical in making such a statement. The only chair with a specifically shaped lumbar
support was rated as the least preferred chair. The first and second most preferred chairs were
very similar in nearly all respects and were both synchronised dynamic chairs.
The method of paired comparison can be successfully used in identifying the most
preferred chair from a group of chairs. The methodology however does leave many
unanswered questions, and does not tell us why a particular chair was the most preferred, or
what aspects of the chair made it the most preferred.

Field Survey
In the field survey 30 subjects reflecting the user population of Shell-Mex House, including
data processors, word processors and interactive users took part. There were 15 females
ranging in stature from 152.5cm to 170.0cm (10th to 90th percentile) (Pheasant, 1990),
weighing from 44.5kg to 85kg, age range 25 to 55. The 15 males ranged from 155.0cm to
190.0cm (10th to 95th percentile), weighing from 55kg to 90kg, age range 25 to 55. All of the
subjects spent the majority of their working day at their workstations.

Procedure
The same six chairs were used as in the paired comparison trial. The subjects used each chair
for one full day on Tuesday, Wednesday and Thursday over two weeks to reduce some of the
effects of Monday freshness and Friday staleness. The use of the chairs was balanced to
ensure that the chairs were not evaluated in the same order throughout the trials. Subjects
were instructed on chair adjustment and left instructions should more adjustments be
required. Once subjects had got themselves settled with the chair adjusted to their liking they
filled in two Corlett comfort questionnaires, a mood scale, and Shackel’s 11 point comfort
scale. Shortly before the end of the working day subjects filled in a second copy of the above
questionnaires plus a chair feature checklist.

Results
The Shackel scale showed that only 6% of subjects felt uncomfortable in chair 6 (Sedus)
by the end of the day, compared to 50% in chair 3 (Criterion) (Figure 2).

Figure 2. Percentage of subjects recording discomfort on the Shackel scale.



Procedure
The pressure study was conducted using a Tally Mk 3 pressure monitor with a seatpan matrix
of 140 pressure reading points and a matrix of 40 points for the backrest. The chairs were
adjusted appropriately for each subject, with the seatpan and backrest fixed to eliminate
variations in subjects’ positions, and subjects rested their arms lightly on the arm rests. The
pressure readings were then computed into pressure charts and colour contour maps to
give a better visual appreciation of the pressure readings.

Results
The data were given an ‘eyeball’ analysis as described by Reinecke et al (1986). The pressure
readings correlated with each subject’s size and weight, indicating that the data had been
recorded correctly. From the data, no one chair significantly reduced seat pressure, but the
Sedus did distribute the pressure more evenly, particularly in the heaviest subject.

Discussion
In this study static seat pressures were measured and in keeping with previous studies the
highest recorded pressures were over the ischial tuberosities. The results from this study
agree with the findings of both the subjective evaluations of the six chairs. The Sedus chair
showed slightly less pressure overall, and also better distribution of pressure with less of a hot
spot under the ischial tuberosities and the sacral area.

Project discussion/conclusions
The paired comparison was a quick and easy method of evaluating chair comfort. However it
gave no information as to why the Sedus was the most comfortable chair of the six evaluated.
The short term study took each member of staff away from their workplace for approximately
45 minutes. The long term study took each member of staff away from their work for no more
than 20 minutes and at the same time gave a huge amount of useful information on subjects
preferences in chair comfort, areas of discomfort and features of each chair which increased
or decreased their comfort. The time involved in the long term study was that of the
individual responsible for the analysis of the collected data; the time cost for the paired
comparison was more than twice that required for the long term study.
There are literally hundreds of chairs on the market today described as ergonomic;
unfortunately, this does not mean they are comfortable. The methodology used in this study enables
companies to involve their employees in selecting a comfortable chair for their use in an
inexpensive way and ensures a truly ergonomic approach.

References
Akerblom, B. 1948, Standing and sitting posture with special reference to the construction
of chairs. Doctoral dissertation, Nordiska Bokhandeln, Stockholm.
Keegan, J.J. 1953, Alterations of the lumbar curve related to posture and seating. Journal of
Bone and Joint Surgery, 35-A, 589–603.
Shackel, B., Chidsey, K.D. and Shipley, P. 1969, The assessment of chair comfort.
Ergonomics, 12(2), 269–306.
Guilford, J.P. 1954, Psychometric Methods (McGraw-Hill Book Co., London).
Pheasant, S.T. 1990, Anthropometrics: An Introduction, 2nd edition (B.S.I., London).
Corlett, E.N. and Bishop, R.P. 1976, A technique for assessing postural discomfort.
Ergonomics, 19, 175–182.
NEW TECHNOLOGY
DEVELOPMENT OF A QUESTIONNAIRE TO MEASURE
ATTITUDES TOWARDS VIRTUAL REALITY

Sarah Nichols

Virtual Reality Applications Research Team (VIRART)


Department of Manufacturing Engineering and Operations Management
University of Nottingham, University Park
Nottingham, NG7 2RD
epxscn@epn1.maneng.nottingham.ac.uk

The attitude of users towards Virtual Reality (VR) may influence their
experience of effects of VR use. This paper describes the development of a
questionnaire to measure potential users’ attitudes towards VR. Responses
from 167 questionnaires from staff and students at the University of
Nottingham were analysed. A principal components analysis revealed eight
factors that comprised an overall attitude to VR, including awareness,
perceived usefulness, perceived desirability of VR and concerns about health
effects of VR use. Experience of VR was found to be positively correlated
with some aspects of attitude to VR, and men were found to generally have a
more positive attitude to VR.

Introduction
As Virtual Reality (VR) technology has developed over the last few years, it has become
apparent that there are a number of effects, both negative and positive, that result from VR
use. These effects include a feeling of presence (or “being there”) in a Virtual Environment
(VE) (the “world” simulated by VR technology) and negative effects such as the experience
of symptoms akin to motion sickness either during or after VR use.
The influential factors on these effects have previously been classified as being associated
with the VR system (e.g. headset design, resolution of visual display), VE type and design
(e.g. number and type of objects, amount of interactivity afforded), circumstances of use (e.g.
length of period of VR use) and individual user characteristics (Nichols et al., 1997). Many
individual user characteristics may influence the effects experienced after VR use, including
gender, visual characteristics or previous experience of motion sickness after travelling in
cars, boats or planes, or experience of sickness after simulator use. However, the individual
characteristic of interest here is the attitude of users towards VR.
Questionnaires to measure attitudes towards computers have been found to comprise
several contributing factors such as computer anxiety, enjoyment, satisfaction, perceived
usefulness and experience (Loyd & Gressard, 1984; Igbaria et al., 1994). Attitudes vary
amongst different areas of the population, make a contribution to effectiveness of computer
use in the workplace and can be influenced by training (Torkzadeh & Koufteros, 1993).
Therefore if VR is to be implemented as a workplace tool, it is useful to examine attitudes
of users before VR use. A more positive attitude should lead to a more effective and enjoyable
use of VR as a workplace or educational tool. If this relationship is found to exist, training
programmes for VR use should have the improvement of VR user attitudes as a primary aim.
However, there are aspects of VR which mean that a Computer Attitudes Scale cannot
necessarily be used to measure attitudes towards VR. These aspects include the three
dimensional, interactive and user-centred nature of Virtual Environments (VEs), the large
amount of physical freedom available to the participant and the idea that a participant
experiences a sense of presence in a VE.
This paper presents a questionnaire developed to measure potential and existing users’
attitudes towards VR. The questions consider a number of aspects of VR use, including
whether VR is thought to be a useful technology for work and leisure, whether people would
be willing to use VR in their workplace, whether VR systems should include a headset, if
there are any health issues perceived to be associated with VR use and whether VR use
provides the opportunity for social interaction.
In order for a questionnaire to be used as a statistical tool that can provide
parametric data, a number of statistical procedures must be performed. This analysis also
allows the identification of component factors of an overall attitude towards VR. In addition,
the relationship between VR attitudes, experience of VR, age and gender are examined.

Method

Initial VR Attitudes Questionnaire Design


An initial version of the questionnaire was developed. The items included were derived from
the author’s knowledge of computer attitude questionnaires and experience with participants
in VE experiments over the last 2 1/2 years. 45 items were produced. A short introduction to
Virtual Reality and Virtual Environments was included with the questionnaire so that people
who had not previously seen or used VR would still be able to complete the questionnaire.
The scale used in the questionnaire was a five point Likert scale. In addition, for purposes
of questionnaire development, respondents were given the additional option to select a “don’t
understand” option, in order to identify those questions which were badly worded or
confusing to those with low experience of VR, and to avoid a high central tendency.

Participants
The participant sample consisted mainly of students and staff from the University of
Nottingham who either completed the questionnaire during a lecture or were approached by
the author and asked to return the completed questionnaire via internal mail.
284 questionnaires were distributed in total. 167 completed questionnaires were received
(94 (56.3%) male, 73 (43.7%) female; response rate=58.8%). The mean age of respondents
was 20 yrs 11 months (range=18–36 yrs, SD=3 yrs 1 month). The sample also included a
wide spread of people with different levels of experience with technology in general and VR
in particular for the purposes of test construction. As the development process continues and
the questionnaire is used in experiments it is hoped that this population will be expanded to
include other potential users of VR such as a military population.

Questionnaire Analysis

Scoring
Initially, the appropriate direction of scoring for each questionnaire item was estimated by
observation. A high score indicated a positive attitude. If there was any doubt about the
appropriate direction of scoring then the items were initially classified as “neutral” and later
the direction of scoring was determined statistically (see “Item Analysis”). 16 questions were
scored as “positive”, 15 as “negative” and 14 as “neutral”.
Questionnaires where more than five responses of “don’t understand” were given were
eliminated from further statistical analysis. It was felt that the responses on these questionnaires
were unreliable and not suitable for inclusion in the factor analysis. This resulted in the elimination
of data from eight respondents. Further examination of these respondents revealed that all but
one had a low level of experience or knowledge of VR. The remaining respondent was in fact a
VR programmer, which obviously resulted in him having a very high level of VR knowledge
and experience. However, this respondent spoke English as a second language. Although this
data is not suitable for statistical analysis at this stage, it is intended that this questionnaire
should be suitable for use by people of all levels of VR experience and should be understandable
by speakers of English as a second language, therefore this information is used in further
examination of question appropriateness and wording.

Item Analysis
Several types of item analysis were carried out. Firstly, the responses of “don’t understand”
were examined by question. This revealed that 13 questions had more than 2.5% of
respondents not understanding the question, 3 of which were not understood by more than
5% of respondents. No elimination of questions was made on the basis of this data alone;
however, it was taken into consideration when further item analysis was completed.
After the “don’t understand” responses had been examined the data from the eight respondents
were eliminated and the remaining responses of “don’t understand” were converted into neutral
(3 points for positive and negative worded questions) for all further analysis.
At this stage the items previously classified as “neutral” were assigned a direction according
to the direction of their correlation with the total of all items to which a direction had
previously been assigned. As a result of this, four more questions were positively scored and
four negatively scored. After this analysis one question was eliminated due to having a
very high central tendency and having had six responses of “don’t understand”.
After reliability analysis was performed, items which had a value of Cronbach’s Alpha
<0.15 were eliminated. As a result of this, the five remaining neutral items were eliminated.
This resulted in no neutral items remaining, and a final value of Cronbach’s Alpha of 0.886.
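A minimal sketch of the reverse scoring and reliability steps, assuming random stand-in responses, a 5-point Likert coding and a hypothetical set of negatively worded items, is:

# Illustrative sketch: reverse-scoring negative items and computing Cronbach's
# alpha. The data are random and the item classification is hypothetical.
import numpy as np

rng = np.random.default_rng(0)
responses = rng.integers(1, 6, size=(159, 30)).astype(float)  # 1-5 Likert scores
negative_items = [2, 5, 11]          # hypothetical negatively worded items

# Reverse-score negative items so a high score always indicates a positive attitude
responses[:, negative_items] = 6 - responses[:, negative_items]

def cronbach_alpha(items):
    """Standard alpha: k/(k-1) * (1 - sum of item variances / variance of total)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

print(f"alpha = {cronbach_alpha(responses):.3f}")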

Factor Analysis
A Principal Components Analysis was performed with Orthogonal Varimax rotation (see
Table 1) after it was confirmed that the data met the criteria required for a factor analysis to
be performed (Ferguson & Cox, 1993). The criterion of assignment of items to the eight
factors extracted was a loading of at least 0.40. The factors were named after four
independent observers were shown the items grouped according to their factor loadings and
asked to assign titles to each factor. These four observers then discussed the titles they had
thought of, and agreed on a suitable final name.

Table 1. Factor loadings of questionnaire items
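A minimal sketch of a principal components analysis with varimax rotation and a 0.40 loading criterion, run on random stand-in data rather than the questionnaire responses, is:

# Illustrative PCA with varimax rotation on random stand-in data.
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Rotate a loading matrix using the standard varimax criterion."""
    p, k = loadings.shape
    rotation = np.eye(k)
    var = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated ** 3 -
                          (gamma / p) * rotated @ np.diag((rotated ** 2).sum(axis=0))))
        rotation = u @ vt
        var, old_var = s.sum(), var
        if old_var != 0 and var / old_var < 1 + tol:
            break
    return loadings @ rotation

rng = np.random.default_rng(1)
data = rng.normal(size=(159, 30))              # stand-in for 159 scored questionnaires
corr = np.corrcoef(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1][:8]          # retain the eight largest components
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
rotated = varimax(loadings)
assigned = np.abs(rotated) >= 0.40             # assignment criterion of 0.40
print(assigned.sum(axis=0))                    # number of items loading on each factor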

Relationship between individual characteristics and attitudes


There were no significant correlations between age and either total VR attitude or any of the
subscale totals. VR experience was found to be correlated with overall attitude to VR
(rs=0.349; p<0.001), VR Desirability (rs=0.301; p<0.001), VR Awareness (rs=0.457; p<0.001),
VR Usefulness (rs=0.357; p<0.001) and Enthusiasm for VR (rs=0.354; p<0.001).
Independent samples t-tests showed that responses of men and women differed in overall
attitude (t=4.27; df=157; p<0.001), VR health anxiety (t=3.23; df=157; p<0.001), VR
Desirability (t=2.43; df=157; p<0.02), VR Awareness (t=4.78; df=157; p<0.001) and
Enthusiasm for VR (t=3.28; df=157; p<0.001). In all cases men were found to have a more
positive attitude than women.
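A minimal sketch of these statistics (Spearman's rho and independent samples t-tests), computed on random stand-in data rather than the questionnaire responses, is:

# Illustrative correlation and group-comparison statistics on stand-in data.
import numpy as np
from scipy.stats import spearmanr, ttest_ind

rng = np.random.default_rng(2)
experience = rng.integers(0, 5, size=159)        # hypothetical VR experience rating
attitude = experience + rng.normal(size=159)     # hypothetical attitude total
gender = rng.integers(0, 2, size=159)            # 0 = female, 1 = male

rho, p_rho = spearmanr(experience, attitude)
t, p_t = ttest_ind(attitude[gender == 1], attitude[gender == 0])
print(f"rho = {rho:.3f} (p = {p_rho:.3g}); t = {t:.2f} (p = {p_t:.3g})")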

Conclusions
This preliminary analysis has shown that attitude towards VR comprises a number of
factors, including awareness of VR capability and use, perception of how useful VR is,
anxiety towards VR use and general enthusiasm for using VR. Previous experience or
knowledge of VR was found to be strongly correlated with five of the factors and the overall
attitude, indicating that experience generally has a positive effect on improving attitude.
This analysis has succeeded in shortening the initial questionnaire. However, it may be the
case that there is still some repetition in the questions, and further items could be removed.
This could also result in the removal of some questions which were commented on as being
ambiguous, such as those which did not distinguish between headset and desktop VR
systems, or which confused VR in general with VEs. It may also be that confusions such
as these would be eliminated once general public knowledge about VR technology is
increased, although actual differences in attitudes towards different types of systems may
exist; if this is the case then this issue must be examined in detail.
Finally it is important to note that there may be different requirements of a VR attitudes
questionnaire from different types of users. One of the main benefits of VR technology is that
it is not restricted to use by English speaking, able-bodied workers in higher education
systems. The development of further versions of this questionnaire is proposed, where
question wording is improved for speakers of English as a second language, shorter versions
of questionnaires are developed for use with children, and single questions from each factor
are used in a symbolic form for assessment of attitudes of people with learning disabilities.

References
Ferguson, E. & Cox, T. (1993) Exploratory Factor Analysis: A Users’ Guide. International
Journal of Selection and Assessment, 1(2), 84–94.
Igbaria, M., Schiffman, S.J. & Wieckowski, T.J. (1994) The respective roles of perceived
usefulness and perceived fun in the acceptance of microcomputer technology.
Behaviour & Information Technology, 13(6), 349–361.
Loyd, B. & Gressard, C. (1984) Reliability and factorial validity of computer attitude scales.
Educational and Psychological Measurement, 44, 501–505.
Nichols, S., Cobb, S. & Wilson, J.R. (1997) Health and Safety Implications of Virtual
Environments: Measurement Issues. Presence: Teleoperators and Virtual
Environments, 6(6).
Torkzadeh, G. & Koufteros, X. (1993) Computer user training and attitudes: a study of
business undergraduates. Behaviour & Information Technology, 12(5), 284–292.
ORIENTATION OF BLIND USERS ON THE WORLD
WIDE WEB

Mary Zajicek, Chris Powell, Chris Reeves*

The Speech Project, School of Computing and Mathematical Sciences


Oxford Brookes University, Gipsy Lane, Oxford OX3 0BP, UK
Tel: +44 1865 484683, Fax: +44 1865 483666
Email: mzajicek@brookes.ac.uk

*Royal National Institute for the Blind


224 Great Portland Street, London W1N 6AA

The aim of our work is to make the wealth of information on the World Wide
Web more readily available to blind people. We wish to enable them to make
quick and effective decisions about the usefulness of pages they retrieve. We
have built a prototype application called BrookesTalk which we believe
addresses this need more fully than other Web browsers. Information
retrieval techniques based on word and phrase frequency are used to provide
a set of complementary options which summarise a Web page and enable
rapid decisions about its usefulness.

Introduction
This paper describes the results of an evaluation of BrookesTalk, a web browser for blind users
developed at the Speech Project at Oxford Brookes University. The aim was to evaluate the
utility of the multi-function virtual menubar provided by BrookesTalk with blind users
including those based at the Royal National Institute for the Blind.
The aim of the project is to provide a tool which will enable blind users to ‘scan’ web
pages in the way sighted users do (Zajicek and Powell, 1997a) and find the useful information
that is out there. BrookesTalk is designed to extract an information rich abbreviated form of a
web page and present it using speech so that the blind user can make a quick decision as to
whether the page will be useful or not. It also provides a means by which blind users can store
particular sentences and mark places as they move around the web.

BrookesTalk Described
BrookesTalk is a small speech output browser which is independent of conventional browsers
and also of separate text to speech software applications, using Microsoft speech
technology.
It includes the functionality of a standard Web browser for the blind (Zajicek and Powell,
1997b) such as pwWebSpeak(TM) in that it can break up the text part of a Web page into headings
and links and read out paragraphs etc. However the main aim is to provide an orientation tool
for blind users in the form of a virtual toolbar of functions that will provide different synopses
of a Web page to help the user decide whether it will be useful to them or not.
Users can select from a menu of: a list of headings, a list of links, a list of keywords, a list
of bookmarks, an abridged version of the page, a list of scratchpad entries and a summary of the
page, and can also reach and read out chunks of text which are organised hierarchically under
headings. It is expected that the user will pick tools from this virtual toolbar which
complement one another for the particular type of page under review.
The list of keywords consists of words which are assumed (Luhn, 1958) to be particularly
meaningful within the text. These are found using standard information retrieval techniques
based on word frequency. Abridged text is also compiled from sentences which have
been isolated using trigrams.
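A minimal sketch of frequency-based keyword extraction in the spirit of Luhn (1958), assuming a simple stopword list and a cut-off of ten keywords (not the BrookesTalk implementation), is:

# Minimal frequency-based keyword extraction; stopword list and cut-off are assumptions.
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
             "for", "on", "with", "that", "this", "it", "as", "by", "be"}

def keywords(text, n=10):
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(n)]

page_text = "Equal opportunities policy ... "   # text extracted from a Web page
print(keywords(page_text))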
The scratchpad allows users to send any sentence they are listening to, which they
consider important or worth noting, to the scratchpad simply by pressing a key. They can then
playback lists of sentences linked to particular pages.
The summary of the page includes author defined keywords, the number of words in a
page, the number of headings and the number of links.
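A minimal sketch of how such a page summary might be computed from the HTML, using Python's standard library parser rather than the BrookesTalk code, is:

# Sketch of a page summary: word count, headings, links and author keywords.
from html.parser import HTMLParser

class PageSummary(HTMLParser):
    def __init__(self):
        super().__init__()
        self.words = 0
        self.headings = 0
        self.links = 0
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
            self.headings += 1
        elif tag == "a" and "href" in attrs:
            self.links += 1
        elif tag == "meta" and attrs.get("name", "").lower() == "keywords":
            self.keywords = [k.strip() for k in attrs.get("content", "").split(",")]

    def handle_data(self, data):
        self.words += len(data.split())

summary = PageSummary()
summary.feed("<html><head><meta name='keywords' content='post, stamps'></head>"
             "<body><h1>Welcome</h1><p>Some text <a href='x.html'>link</a></p></body></html>")
print(summary.words, summary.headings, summary.links, summary.keywords)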

Keywords and Abridged Text Explained

Keywords
Examples of the keyword extractions found for three different Web pages are
shown in Table 1.
Table 1. Examples of Keyword Extractions

An evaluation of the usefulness of keywords compared with headings or links is given in the
next section.

Abridged Text
When keywords are used, contextual information imparted by the position of a word in a
sentence is lost. Extraction of three word key phrases, or trigrams, preserves some word
position information. The technique is based on ‘Word level n-gram analysis’ in automatic
document summarisation (Rose and Wyard, 1997). To provide a measure of similarity, groups
of words appearing together, rather than individual words, are compared.

To reduce the number of word-level mismatches due to the normal changes in spelling
required by grammar, each element of a trigram was assigned the stem of a word rather than
the word itself. The trigrams presented were ranked by frequency.

Table 2

The RNIB Web page ‘Equal Opportunities Policy’, a 238 word document, gave the five
trigrams of frequency >1 shown in Table 2. High frequency trigrams occur twice, low
frequency trigrams once, providing at first glance little to distinguish between them. Many of
the words in the trigrams are noise words which are required for grammatical correctness and
are not content bearing.
A summation of frequency of trigram, number of content words in the trigram and number
of keywords in the trigram appears as the score for the trigram in Table 2, which also shows
key trigrams for the ‘Equal Opportunities Policy’ page.
Abridged pages were created by computing the key trigrams of a page, according to score,
and then creating a page consisting of the sentences in which the trigrams appeared. Abridged
pages on average worked out to be 20% of the size of the original text and, unlike keyword
lists, are composed of well formed (comprehensible) sentences.
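A minimal sketch of this trigram scoring and abridging, assuming a crude suffix-stripping stemmer and noise word list (not the BrookesTalk implementation), is:

# Sketch of trigram-based abridging: stem words, count trigram frequencies,
# score each trigram as frequency + content words + keywords it contains, and
# keep the sentences containing the top-scoring trigrams.
import re
from collections import Counter

NOISE = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are", "for", "on", "with"}

def stem(word):
    # crude suffix stripping, for illustration only
    return re.sub(r"(ing|ed|es|s)$", "", word)

def abridge(text, keywords, n_trigrams=3):
    sentences = re.split(r"(?<=[.!?])\s+", text)
    tokenised = [[stem(w) for w in re.findall(r"[a-z]+", s.lower())] for s in sentences]
    freq = Counter(tuple(toks[i:i + 3]) for toks in tokenised for i in range(len(toks) - 2))
    stemmed_keywords = {stem(k) for k in keywords}

    def score(trigram):
        content = sum(w not in NOISE for w in trigram)          # content-bearing words
        keyword_hits = sum(w in stemmed_keywords for w in trigram)
        return freq[trigram] + content + keyword_hits

    top = set(sorted(freq, key=score, reverse=True)[:n_trigrams])
    kept = [s for s, toks in zip(sentences, tokenised)
            if any(tuple(toks[i:i + 3]) in top for i in range(len(toks) - 2))]
    return " ".join(kept)

print(abridge("RNIB aims to offer equal opportunities to all. Equal opportunities "
              "apply to every member of staff. This unrelated sentence should be dropped.",
              ["equal", "opportunities"]))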

Evaluation
Keyword Evaluation
Preliminary experiments were performed to assess the usefulness of the keyword list as an
indicator of page content compared to the headings list or the links list. Headings and
keywords in particular were judged to be roughly comparable in that they provided a list of
indicating words or short phrases.
The argument for incorporating keywords in the BrookesTalk menubar is that it provides
more flexibility for the user in summarising the Web page. If the author has truly
encapsulated the meaning of subsections of the page in headings then headings should
provide a significantly better indicator of page content than keywords. However headings are
often represented as images which do not provide speech output, or are eye-catching rather
than informative. In this case keywords could provide a better summary. The aim of
BrookesTalk is to provide flexibility with a range of tools to aid orientation which can
override many of the vagaries of Web page authoring.

Users’ perception of the usefulness of the representations was measured by asking them to
evaluate the usefulness of describing a Web page using the three different types of summary
representations, headings, links/anchors, and keywords.
Twenty subject users were shown the different representations for six different Web pages.
The pages were chosen to maximise variability. Subjects gave a score between 0 and 5 for
each representation. The sum of the scores for each representation, together with the
percentage of the total score it represented, was taken to give an indication of its
effectiveness; results are shown in Table 3.
We see that users perceived that keywords provide a considerable improvement on the use
of links to orientate users to Web pages. Headings gave the best score but the score for
keywords was not significantly different.

Table 3. Scores for different representations

Usability And The BrookesTalk Environment


The prototype BrookesTalk was used by a group of blind users including those at the Royal
National Institute for the Blind (RNIB). User acceptance was not a problem as this group were
committed to finding out what software is available for blind people. They were all
technically able, although our ultimate goal is to develop software for non-technical users so
that all blind people can use the Web.
Earlier versions of BrookesTalk required TextAssist software for the speech synthesis.
This often required patching in Windows’95 and caused discouraging technical
complications before getting started. BrookesTalk was then re-written to run on the Microsoft
Speech engine for Windows’95. While increasing portability the speech engine currently uses
a lot of precious disc space. Both versions are available.
BrookesTalk uses different voices for conceptually different parts of a Web page. This was
appreciated by most but described as irritating by one. We plan to make different voices
optional in the future.
BrookesTalk runs without any visual display at all and does not run with another browser
open. Users felt it would be useful to have the visual equivalent of the spoken page available
at the same time so that sighted co-workers could be called in for clarification or work co-
operatively with the blind user.

Evaluating The Functionality Of BrookesTalk


Users were observed to rely heavily on one function rather than move between different
summarising representations. They had been encouraged to try using the different functions to
complement one another as they provide different views of the page. Users said that they usually
know what type of page they are searching for: research work, entertainment, product details etc.
They therefore know how useful headings are likely to be and can use keywords accordingly.

Surprisingly one user orientated himself by using the movement between links key 90% of
the time. We had not anticipated that he would build his conceptual model of the page by
looking at what was behind it. This approach will be investigated fully!
The abridged version of the page received most criticism. The trigram analysis could
easily pick out the wrong trigrams as being significant, and important headings were
frequently left out of the summary. The algorithm for picking trigrams is not very stable: it can
easily be influenced by irrelevant words. It was suggested that trigrams should carry some
kind of semantic weighting if they appear in the title or headings.
The scratchpad worked well and provided an easy way of saving important sentences from
the page. As yet sentences are linked to pages and a new scratchpad must be started with each
new page. Users suggested that sentences could be tagged as related to search themes. In this
way sentences from several different pages could be grouped by theme.

Conclusion
Initial evaluation highlighted the potential usefulness of features designed to improve
navigation, such as the use of keywords and page summary. Blind users emphasised the
potential of a tool such as BrookesTalk to sort through what they referred to as the ‘increasing
pile of paper that arrived on their desks’ during the working day. Methods used to translate
HTML formatting to speech could easily be applied to other formatted documents.
They felt that the product is fairly accessible for this stage of its development, but some
problems exist with accessing links. Useful aspects included vocal notification of HTML
markup (i.e. if text is a heading, alt-text etc.) and being able to move between headings and
start speech from specified points in the text. By incorporating usability at such an early stage
in the development process the system is more likely to meet user needs, with the next step
being a more structured approach to obtaining feedback. This will allow faster
communication of problems and improvements, ensure all functionality is assessed and
highlight potential new areas of development.

References
Luhn, H.P. 1958, The automatic creation of literature abstracts. IBM Journal of Research
and Development, 2, 159–165.
Rose, T., and Wyard, R., 1997, A Similarity-Based Agent for Internet Searching.
Proceedings of RIAO’97.
Zajicek M., Powell C., 1997a, ‘Building a conceptual model of the World Wide Web for
visually impaired users’, Contemporary Ergonomics 1997, (Taylor and Francis,
London), 270–275
Zajicek M., Powell C., 1997b, ‘Enabling Visually Impaired People to Use the Internet’, IEE
Colloquium ‘Computers helping people in the service of mankind’, London
“FLASH, SPLASH & CRASH”: HUMAN FACTORS AND THE
IMPLEMENTATION OF INNOVATIVE WEB TECHNOLOGIES

Adam Pallant and Graeme Rainbird

RM Consulting, Royal Mail Technology Centre,


Wheatstone Road, Dorcan,
Swindon SN3 4RD

email to <zap@factor.netkonect.co.uk>

This paper is based upon the results of a heuristic evaluation of early design
concepts for a new Royal Mail Web site. The design concepts consisted of
static page mock-ups incorporating innovative Web technologies as
implemented in existing Web sites. The acceptability of these design features
in light of the particular constraints imposed by the Web is discussed. It is
argued that reliance on innovative technologies is likely to exclude a
proportion of potential customers. The paper concludes that Web designers
should strive for simplicity rather than innovation, thereby making their sites
accessible to the widest possible audience while still fostering positive
customer perceptions.

Introduction
The World Wide Web (‘Web’) is increasingly regarded by corporations as an important point
of contact with customers, as well as a potentially valuable sales channel. Royal Mail has had
a presence on the Web since 1995, providing customers with information about its range of
products and services. This original site was seen as outdated, however, and new designs
incorporating innovative Web technologies were being considered.

The principal business objectives of the new site were:

– To allow customers to learn about and purchase Royal Mail products and
services.
– To exploit the opportunities of the Web and provide new ways for customers
around the world to interact with Royal Mail.
– To foster positive customer perceptions about Royal Mail.

Human factors involvement was sought to evaluate the design concepts for the new Royal
Mail site. The principal objective of the evaluation was to assess the ‘acceptability’ of the
innovative technologies against established usability principles and design guidelines.

Design Concepts
The concepts proposed by the design team consisted of a series of static, non-functional page
mock-ups illustrating the proposed ‘look and feel’ of the site (see Figure 1 below for an
example). These designs were based upon the use of innovative Web technologies already
implemented in a sample of existing Web sites.

Figure 1. Mock-up of Web page for new Royal Mail site

The key features of the design concepts included:

– The display of an animated ‘Splash’ screen welcoming visitors to the site but
providing no content in its own right (somewhat like a book cover).
– The use of horizontal ‘Channels’ and vertical ‘Services’ toolbar structures to
support navigation around the site.
– The use of the ‘Flash’ plug-in extending the functionality of the Web browser by
supporting the use of animated graphics, including ‘mouse-over’ animations and
text fields, within the site.
– The use of ‘frames’, or separate page areas, housing distinct content and
navigation functions.
– The display of page content in additional fixed width ‘pop-up’ Web browser
windows.

A heuristic evaluation of existing sites was performed to assess the potential impact of these
design features on the usability of the proposed Royal Mail site.

Usability
The usability of each of a sample of existing Web sites was evaluated with particular regard
to the implementation of the key design features described above. Issues were classified as
either enhancing [key: ✓] or reducing [✕] the usability of each site. A sample of the issues
identified for each design feature is listed in Table 1 below.

Table 1. Usability issues identified during heuristic evaluation of sample Web sites
[key: ✓=enhanced usability; ✕=reduced usability]

The implementation of each design feature had a dramatic impact upon the usability of the
sample sites. The results of the heuristic evaluation were therefore used to inform a list of
recommendations intended to guide designers in successfully implementing these features in
the Royal Mail site.

The usability of the design features was also likely to depend upon the specific context of use.
Users with relatively little Web experience, for example, were more likely to be confused by
pop-up browser windows and mouse-over animation. Indeed, whatever their potential
usability benefits, it was suggested that reliance on innovative Web technologies risked
excluding a proportion of the potential user population.

Accessibility
The major constraint of Web design is the variability of the potential task context and hence the
difficulty in specifying user requirements. It is often impossible to predict who will visit a site,
why they will go, and how they will get there (not to mention what they will do once there).

The Royal Mail site was intended to be accessible to the widest possible audience, whatever
their user characteristics (e.g. philatelists vs. Royal Mail employees), task requirements (e.g.
‘browsing’ vs. ‘searching’), or platform (e.g. hardware, software or system settings). It was
argued that the difficulties inherent in making the site universally accessible were likely to be
exacerbated by reliance on innovative technologies.

For example, potential Royal Mail customers may not have had access to plug-in technology
such as Flash for any of a number of reasons:

– They may have been unwilling to download the plug-in just to view the contents
of the site, particularly if they were using slow modem connections.
– They may have been viewing the site with a text only browser, or have disabled
graphics in their browser settings, and consequently have had no requirement for
‘animated graphics’.
– They may have experienced difficulties installing the plug-in. The installation
process can be complicated, particularly when it starts to go ‘wrong’, as the
following on-line ‘troubleshooting’ guide excerpt illustrates: “..If you can’t find the
file NPSPL32.DLL, click on “Start”. Select “Find…Files or Folders”. Under
“Name & Location”, enter “NPSPL32.DLL” in the Named field. Make certain
you have selected the upper root of your hard drive (C: in most cases)…”
– Their browser settings may have prevented them from installing the plug-in. Having
‘Safety Settings’ set to ‘High’ in Internet Explorer 3.02 (the default setting), for
example, prevents Flash from being downloaded. Similarly, having ‘JavaScript’
disabled in Netscape 2.x interferes with the installation of the plug-in.
– They may have been using a platform which did not support the plug-in. For
example, Mac users must have installed Netscape 3.x or above to use Flash.
– Their server ‘firewall’ may have prevented them from downloading Flash.
– They may have been unable to find the plug-in. Several sites, for example,
directed users to a ‘plug-in directory’ page, leaving them to search for Flash from
among the dozens of available plug-ins.

Not only could reliance on innovative technologies exclude potential customers, it could cause
severe problems for customers who do manage to access the site content. The latest technologies
are often unstable, unpredictable and inadequately tested. Downloading plug-ins, opening
additional browser windows, running animations and rendering frames are all prone to draining
system resources and crashing even (or especially) the most up-to-date browsers.

As one commentator notes: “The problem with ‘bleeding edge’ technology is that the blood
on the floor ends up being yours.” (Bystrom, 1996).

Conclusion
Innovative technologies have the potential to enhance Web site usability. The ‘mouse-over’
animations enabled by Flash, for example, can support users in identifying ‘clickable’ page
areas (or links). While meeting the expectations of some customers, however, Web designs
relying on innovative technologies are likely to exclude others.

One approach to resolving the tension between innovation and accessibility is configurability: tailoring a customer’s experience of a site to suit their particular
requirements. This approach has the advantage of allowing the site to exploit the latest
technologies, while still leaving the content accessible to a wide range of potential customers.
Such ‘configuration’ can be achieved in one of two ways:

– Automatically: by having the server determine the task context and act accordingly (‘server-push’). For example, the server can detect which browser a customer is using and send only data appropriate to that platform, as sketched after this list.
– Manually: by allowing the customer to select the version most appropriate to their
requirements (‘client-pull’). For example, customers can be presented with
options such as whether to view the site with or without frames, graphics,
proprietary plug-ins etc.
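
As a purely illustrative sketch of the ‘server-push’ option (not a description of the approach actually taken on the Royal Mail site), a server-side script might inspect the browser’s identification string and any plug-in detection result and return the page variant most likely to work for that client. The function name, page names and browser strings below are hypothetical; Python is used only for brevity.

# Illustrative sketch of 'server-push' content negotiation. Page names,
# browser strings and the detection rules are hypothetical.
def choose_page(user_agent, flash_detected):
    """Return the page variant most likely to work for this client."""
    ua = user_agent.lower()
    # Text-only or very old browsers get the plain HTML version.
    if "lynx" in ua or "mosaic" in ua:
        return "index_text.html"
    # Only send the Flash-enhanced page if the plug-in was detected.
    if flash_detected:
        return "index_flash.html"
    return "index_plain.html"

if __name__ == "__main__":
    print(choose_page("Mozilla/3.0 (compatible; MSIE 3.02; Windows 95)", False))
    # prints: index_plain.html

The ‘client-pull’ option needs no detection logic at all: a plain entry page simply offers links to the different versions and lets the customer choose.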

Innovative technologies such as Flash, however, are essentially ‘presentational’ tools and as
such stand at odds with the essential value of the Web as a universally accessible information
system. Indeed, the original design of the Web and its underlying language (HTML) was
based on encoding the meaning of information rather than its presentation.

An alternative approach to Web design is simplicity: avoiding innovative technologies and making the site content accessible to the widest possible audience. Simple sites are generally
easier to use, more stable, less error-prone, more broadly compatible and easier to maintain.

Royal Mail subsequently determined that the use of innovative technology in their Web site was inappropriate and inconsistent with customer perceptions of the corporate brand as ‘reliable’, ‘solid’ and ‘traditional’. The new site can be seen at <http://www.royalmail.co.uk>.

References
Bevan, N. (1997) “Usability Issues in Web Site Design.” in National Physical Laboratory site at <http://npl.co.uk/npl/sections/us/frames/fweb.html>.
Bystrom, C. (1996) “ThreeToad Browser Tips.” in ThreeToad MultiMedia site at <http://
www.threetoad.com/main/Tips.html>.
Flanders, V. (1996) “Web Pages That Suck.” at <http://www.webpagesthatsuck.com/>
Levine, R. (1995) “Sun Guide to Web Style.” in Sun On the Net site at <http://www.sun.com/
styleguide/>.
Nielsen, J. (1997) “The Alertbox: Current Issues in Web Usability.” at <http://
www.useit.com/alertbox/>.
Quinn, L. (1997) “Why Write Accessible Pages?” in Web Design Group site at <http://
www.htmlhelp.com/design/accessibility/why.html>.
Sullivan, T. (1997) “All Things Web.” at <http://www.pantos.org/atw/>.
WORK STRESS
DETERMINING ERGONOMIC FACTORS IN STRESS
FROM WORK DEMANDS OF NURSES

DW Jamieson, RJ Graves

Department of Environmental & Occupational Medicine


University Medical School, University of Aberdeen
Foresterhill Aberdeen AB25 2ZD

A number of authors have suggested that ergonomics may be of considerable use for tackling the root cause of work related stress by identifying and
eliminating sources of stress from the workplace. Evidence from a number
of sources suggests that nurses may be one of the groups most affected by
work related stress within the National Health Service (NHS). The current
study aimed to identify characteristics of NHS nursing tasks creating an
imbalance between work demand and the ability of nurses to cope. This was
achieved by developing a checklist questionnaire with the power to identify
such factors. Results from 135 nurses (34%) identified a variety of task
related factors contributing to stressful demand.

Introduction
It has been estimated that stress related illnesses are responsible for more absenteeism from
work than any other single cause (Rees and Cooper, 1992). The Health and Safety Executive
(HSE) treat hazards leading to stress related illnesses in the same way as hazards leading to
physical injury. Guidelines (HSE, 1995) state:
“Employer’s…have a legal duty…to ensure that health is not…at risk through excessive
and sustained levels of stress…from the way work is organised, the way people deal with each
other at…work, or from the day to day demands…on their workforce.”
The implication for UK employers is that they have a legal responsibility for managing stress under the Health and Safety at Work Act 1974 and the Management of Health and Safety at Work Regulations (HSE, 1992). Failure of employers to take this into consideration has led to
several cases of civil litigation, the most significant being Walker versus Northumberland County
Council in 1994, the first time an employee had been awarded compensation in such a case.
Current approaches to work related stress have focused on establishing the root cause so
that changes to the wider work environment can be made to reduce stress (primary
intervention). An example of this type of approach was devised by the Organisation for Promoting Understanding in Society (OPUS, 1995) for the Health Education Authority
which aimed to tackle stress at an organisational level. As well as the organisational approach
to stress other authors have postulated that an ergonomic approach may be of considerable
benefit in reducing work related stress (see for example, Smith and Sainfort, 1989;
Williamson, 1994). Despite the fact that these authors have suggested an ergonomic approach
to work stress, there are relatively few studies that examine the use of ergonomic methods to
respond to stress or which measure the stress related consequences of ergonomic
interventions (Williamson, op cit.).
Previous research has shown that high levels of occupational stress are experienced by all
occupational groups within the NHS. It has been suggested that nurses may be one of the
groups most affected by work stress within the NHS (Rees and Cooper, op cit.). Other studies
of nurses have reported high levels of chronic tiredness, high rates of absenteeism and
widespread job dissatisfaction (OPUS, op cit.).
The aim of this study was to develop a tool which could identify ergonomic factors causing stress related work demands, providing enough information to recommend stress reducing changes. Ergonomic factors were defined as characteristics of a task creating an
imbalance between demand and individual coping. It was also intended that data provided by
the tool could be used to determine the correlation between work demand and work stress,
identify which groups reported the highest levels of stressful demand, and identify which
tasks were found the most stressful.
From previous research there appeared to be no ergonomic intervention strategies which had
been tested or evaluated. Smith and Sainfort (op cit.) use work tasks as one of the main elements
in their ‘balance theory of stress within an ergonomic framework’. According to this theory, if
the task requirements do not match individual capability then this increases the probability of
stress. From work such as this, it is suggested that an ergonomic approach may be of use.
It follows from this that there is a need to establish exactly what it is about the task that
causes the misfit. Of particular concern are factors related to the practicability and endurability of work which may make the work unacceptable and unsatisfying to the individual, increasing the probability of stress. Once it
has been established which tasks cause excess demands and which factors are responsible, it
should be possible to make direct changes to the nature of the task in order to reduce stress.
An ergonomic intervention strategy which could be used to deal with work related stress was
developed for this study from the reviewed literature (see Figure 1).

Approach
The current study was influenced by stage two of the control cycle for the management of
stress. This involves analysing the possibly stressful situation, identifying psychosocial
hazards or demands, and diagnosing the harm being caused. The measurement of stress
should be based primarily on ‘self report’ measures of how people perceive their work and
the experience of stress (see Cox, 1993).
The current approach was to identify excessive demands caused by work tasks and
determine if these task demands led to symptoms of stress as well as assessing the
contribution of ergonomic factors.
A checklist questionnaire with rating scale questions using the Likert scaling method was
developed in four main stages. The first stage involved the identification of typical work tasks
performed on a daily basis by the majority of the workforce under survey. This was done by
conducting interviews with a random sample of nurses of varying grades on different wards.

Figure 1 Ergonomic intervention strategy to deal with work related stress

Nurses were asked to provide a breakdown of their daily duties and, from this, appropriate
task descriptors were selected which were deemed suitable by nurses. The ten task
descriptors included: making beds, washing dependent patients, lifting patients and
paperwork.
Stage 2 of the development of the questionnaire was the identification of demands
potentially arising from work tasks. Cox (op cit.) states that the nature of work related stress
hazards or demands can be classed as psychosocial and refer to job and organisational
demands which have been identified as causing stress and/or poor health. From the nine
categories identified by Cox, four demands were taken from the task design category (lack of
task variety, meaningless work, under use of skills, lack of feedback) and four from the
workload/work pace category (work too hard, too much work, lack of control over pacing,
high levels of time pressure).
Stage 3 was the identification of stress symptoms potentially arising from work demands.
These were selected from the General Health Questionnaire (GHQ) developed by Goldberg
(1972) which was designed to detect current diagnosable psychiatric disorders. Potential
cases are identified on the basis of checking twelve or more of the sixty symptoms. As the
incidence of symptoms reported by a subject increases, so does psychological disturbance and the probability of being a psychiatric case.
For the purpose of this study the GHQ 28, which contains symptoms selected via factor analysis, was considered. The symptoms are categorised as anxiety and insomnia, somatic symptoms, social dysfunction and severe depression, providing a disturbance score for each.
These categories have been consistently highlighted as symptoms of stress (see Cox, op cit.)
and so the symptoms within them could be used to assess psychological stress, as opposed to
general health.
The twelve symptoms selected from the GHQ 28 included: not satisfied with carrying out
a task (social dysfunction); feeling run down (somatic symptoms); lost sleep over worry
(anxiety and insomnia); thinking of yourself as worthless (depression). More consideration of
social dysfunction was relevant since stress was defined as an imbalance between demands
and coping resulting in people behaving dysfunctionally at work.
Stage 4 involved incorporating a section into the appropriate questions allowing
respondents to list causative factors if they found a task ‘very’ or ‘excessively’ demanding.
The Likert scaling method used in the checklist questionnaire produced three different types
of score. The first was the specific task demand score which subjects awarded to each of the
ten tasks in terms of the eight demands considered. The higher the score, the more demanding
subjects found the task.
The second type of score was the overall work demand score which was the sum of the
specific task demand scores. The higher the overall work demand score, the more demanding
subjects found their work in terms of the demands of the 10 tasks considered.
The third type of score was the overall work stress score which was how many of the 12
stress symptoms subjects reported they had experienced as a result of work demands. The
higher the overall work stress score, the more severe the level of psychological disturbance,
as predicted by the GHQ.
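
As a worked illustration of the three score types, the sketch below steps through the arithmetic. The ratings and symptom responses are invented, not data from the study; Python is used only to make the scoring explicit.

# Illustrative sketch of the three score types described above.
# Ratings and symptom responses are invented, not study data.

# Likert ratings (e.g. 0 = no demand ... 4 = excessive demand), one value per
# demand (8 demands) for each task (10 tasks in the study; 3 shown here).
ratings = {
    "making beds":      [3, 2, 1, 2, 4, 3, 2, 3],
    "paperwork":        [2, 3, 2, 1, 3, 3, 2, 2],
    "lifting patients": [1, 1, 0, 1, 2, 2, 1, 2],
}
# Which of the 12 selected GHQ 28 symptoms the respondent reported.
symptoms_reported = [True] * 5 + [False] * 7

# Specific task demand score: sum of the eight demand ratings for one task.
specific_task_demand = {task: sum(values) for task, values in ratings.items()}

# Overall work demand score: sum of the specific task demand scores.
overall_work_demand = sum(specific_task_demand.values())

# Overall work stress score: number of the twelve symptoms reported.
overall_work_stress = sum(symptoms_reported)

print(specific_task_demand)                      # e.g. {'making beds': 20, ...}
print(overall_work_demand, overall_work_stress)  # e.g. 48 5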
After a pilot study of the checklist questionnaire (n=78), and subsequent changes, the final
version was sent, individually addressed, to 321 nursing staff on medical and surgical wards
within a Hospital NHS Trust. Questionnaires from the pilot and main study were combined
for the main analysis (n=399).

Results and Discussion


Questionnaires were returned by 34% (n=135) of the sample. These were fairly representative
of medical and surgical wards, grade, age, and level of nursing experience. In addition, one
third of respondents reported no demand and slightly more reported no stress, which indicates there was not a strong response bias.
It was found that there was a strong, statistically significant correlation between overall work demand scores and overall work stress scores (Spearman 0.58, p=0.00005). This suggests that something about the nature of the tasks was contributing to stressful demand, and so supports the search for ergonomic factors which might be responsible.
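
Computing such a rank correlation from each respondent’s pair of overall scores is straightforward; a minimal sketch (with invented scores rather than the study data) is given below.

# Minimal sketch: rank correlation between paired overall demand and stress
# scores. The score lists are invented; the study reported Spearman = 0.58.
from scipy.stats import spearmanr

overall_work_demand = [12, 30, 45, 22, 38, 51, 19, 27]
overall_work_stress = [1, 4, 7, 2, 5, 9, 2, 3]

rho, p = spearmanr(overall_work_demand, overall_work_stress)
print("Spearman rho = %.2f, p = %.4f" % (rho, p))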
It was also found that ‘D’ grades aged 20 to 29 working full time (n=32) reported
significantly higher overall work demand scores (p=0.023) and overall work stress scores
(p=0.0002) compared with the rest of the sample and were classed as high risk. The tasks
reported as being the most demanding by the high risk group can be viewed as the tasks
causing the highest levels of work stress. Tasks were ordered according to the sum of specific
task demand scores reported by each respondent in the high risk group, which gave an overall
demand score for each task. The higher this score, the greater the probability of an individual
performing that task experiencing stress. The most demanding task reported by the high risk
group was making beds followed by paperwork and then helping with doctor’s rounds.
Comments taken from questionnaires returned by the high risk group indicated task
related factors contributing to stressful demand which might reduce this demand if changed.
An example of these comments in relation to paperwork is shown in Table 1.

Table 1 Example of factors contributing to demand and appropriate demand reducing changes in relation to paperwork

The current study has shown that it was possible to develop a tool which could assess
stressful demand at work, identify contributory ergonomic factors, and suggest control
measures. The results support findings from other studies in relation to the way in which
work demand causes stress (for example, Smith and Sainfort, op cit.). The study can be
viewed as a valuable addition to the limited number of intervention strategies to deal with
work related stress. The lessons learned from this study may be of help in guiding future
research into stressful demand at work.

References
Cox, T. 1993, Stress research and stress management: putting theory to work. Health and
Safety Executive Contract Research Report Number 61
Goldberg, D.P. 1972, The detection of psychiatric illness by questionnaire, (London: Oxford
University Press)
Health and Safety Executive 1974, Health and Safety at Work Act 1974, (HMSO, London)
Health and Safety Executive 1992, Management of Health and Safety at Work Regulations 1992, (HMSO, London)
Health and Safety Executive 1995, Stress at Work (A guide for employers), HS(G) 116, (HSE Books, Sudbury, Suffolk)
OPUS, 1995, for the Health Education Authority, Organisational stress in the NHS: An
intervention to allow staff to address organisational sources of work related stress
Rees, D. and Cooper, C.L. 1992, Occupational stress in health service workers in the UK,
Stress Medicine, 8, 79–90
Smith, M.J. and Sainfort, P.C. 1989, A balance theory of job design for stress reduction,
International Journal Of Industrial Ergonomics, 4, 67–79
Williamson, A.M. 1994, Managing stress in the workplace: Part II, the scientific basis for the
guide, International Journal Of Industrial Ergonomics, 14, 171–196
A RISK ASSESSMENT AND CONTROL CYCLE APPROACH
TO MANAGING WORKPLACE STRESS

Rebecca J Lancaster

Institute of Occupational Medicine


8 Roxburgh Place
Edinburgh, EH8 9SU

A Health and Safety Executive (HSE) publication proposed that the assessment and control cycle approach, already applied to physical health
assessment and control cycle approach, already applied to physical health
and safety risks, be adopted to manage stress at work. The Institute of
Occupational Medicine (IOM) has developed an Organisational Stress
Health Audit (OSHA) using this approach. The feasibility of this was tested
in a study commissioned by the Health Education Board for Scotland
(HEBS). The OSHA is a three tiered approach. Stage One involves the
identification of sources of stress. Stage Two investigates areas of major
concern and generates recommendations for risk reduction. Stage Three
evaluates the effectiveness of the recommendations made in reducing risk.
This paper presents the background to this organisational approach and a
discussion of its feasibility in managing workplace stress in an NHS Trust.

Introduction
Stress is a real problem in the workplace, often resulting in high sickness absence and staff
turnover coupled with low morale and performance. Various intervention strategies have been
suggested to combat the detrimental effects of workplace stress. Murphy (1988) emphasised
the following three levels of intervention which have since been widely accepted: (1) Primary
or organisational stressor reduction, (2) Secondary or stress management training, and (3)
Tertiary, encompassing counselling and employee assistance programmes (EAPs). Whilst
there is considerable activity at the secondary and tertiary levels, primary/organisational
reduction strategies are comparatively rare (Murphy, 1988 and DeFrank & Cooper, 1987). An
HSE publication (HSE, 1995), providing guidelines for employers on how to manage workplace stress, has recognised identifying and controlling causes of stress at source as the most appropriate form of intervention. The HSE recommend the assessment and control cycle approach for managing physical hazards in the workplace, e.g. Control of Substances Hazardous to Health (COSHH), and suggest that this same approach be adopted in controlling psychological
stressors. The Institute of Occupational Medicine (IOM) has developed an Organisational
Stress Health Audit (OSHA) for the identification and control of work-related stress. The
feasibility of this was tested in a research study commissioned by the Health Education Board
for Scotland (HEBS). This study applied the approach in three organisations: heavy industry;
telecommunications; and an NHS Trust. This paper describes its application in one of these,
namely the NHS Trust.
The OSHA is a three tiered approach, covering hazard identification, risk assessment,
review of existing control measures, recommendations for improved control and evaluation of
control. Stage One provides an organisational overview by identifying the presence or
absence of work related stressors and opportunities for risk reduction, many of which can be
implemented by the organisation without further external input. Stage Two focuses on
investigating, in more detail, areas of particular concern identified in Stage One. Stage Three
involves assessing the extent to which recommendations in Stages One and Two have been
implemented and their effectiveness in reducing organisational stress.
A database of known causes of work-related stress was compiled from the scientific
literature and this formed the background to the OSHA. Over recent years, numerous
researchers have carried out extensive studies designed to validate or indeed refute the
existence of work characteristics which impact upon employees’ mental health (eg Cooper &
Marshall 1976, Karasek & Theorell 1990, Warr 1992, Cox 1993, and Nilsson 1993). In
general there is a high level of consensus concerning those psychosocial hazards of work
which are considered to be stressful or potentially harmful (Cox, 1993). However, although
aspects relating to non-work issues, such as home/work interface, are acknowledged to some
degree, there is a general lack of information relating to other factors which have been clearly
shown to have a significant impact on mental health, eg physical hazards, industry specific
pressures and company policies. The IOM researchers developed their approach based on all four components, i.e. Environmental, Physical, Mental and Social, rather than just the work content/context divide. By addressing these four components the total work sphere, i.e. all possible work-related stressors, is investigated. In addition, by placing work-related stress
within this acknowledged health and safety framework, there is more likelihood of stress
being accepted and treated in conjunction with the other types of work-related hazards.
Development of the stressor database involved ascertaining the traceability of the various ‘stressors’ to be included in the IOM approach. In terms of both the Environmental and Physical components, the majority of health related issues are clearly covered by legislation such as the Control of Substances Hazardous to Health Regulations (1988) and the Management of Health and Safety at Work Regulations (1992). A checklist was collated using the current legislative documents on all work-related physical and environmental hazards. Knowledge concerning the latter two areas, i.e. Mental and Social, is not detailed in such a way and, as such, a review of relevant research material was carried out to determine the full range of work-related factors which should be addressed in a comprehensive organisational stress audit.
The OSHA is centred around semi-structured interviews tailored to the specific needs of the
organisation under investigation. Representatives of all levels and functions within the
organisation are interviewed. The line of questioning follows those known causes of work-
related stress in the database. The interviews are constructed from a database of questions
relating to the following areas: Health & Safety, Organisational Structure, Communication,
Management/supervisory skills, Training, Staff Support Facilities, Policy, Sickness Absence,
Contracts/Terms of Employment, Changes/Incidents, Work Characteristics and ‘General’. The
interviews themselves are undertaken by Occupational Psychologists due both to the nature of
the study and the need to interpret and analyse the interviews as the sessions progress.

Method

Preliminary information
In applying the OSHA within the NHS Trust, a certain amount of background information
was requested, including: issues pertaining to the economic and competitive climate
surrounding the organisation; organisational structure; and data on trends in possible
indicators of stress-related problems (sickness absence, staff turnover). This information was
then used to determine a profile of the organisation and presented to the IOM internal stress
team, which comprises business related staff including those experienced in finance and
personnel management as well as appropriate scientific staff such as psychologists and
occupational physicians. Issues were raised through this presentation which helped in
constructing the semi-structured interviews.

Stage One
A Steering Group was formed by the Trust to identify the representatives for interview and to
discuss how the work would be communicated throughout the Trust. All directorates were to
be included and guidance was provided on the roles and functions required for interview.
Subsequently, directorate representatives on the Steering Group identified appropriate
persons for interview. All selected interviewees were contacted by IOM auditors regarding
possible participation and provided with background material on the study. When the final list
of participants was collected, the structured interviews were tailored according to the
interviewee’s function. For example, it would be inappropriate to ask a Managing Director about specific work tasks or an Employee Representative about strategic decisions.
Results of the interviews were then disseminated to the IOM internal team, presenting the
sources of stress identified and possible opportunities for risk reduction. All issues were
considered with regard to the potential impact both on employee health and on the organisation.
A report was then presented to the organisation from which some recommendations have been
implemented without further IOM involvement. The report also contained recommendations
for Stage Two, proposing detailed investigations of the major concerns.

Stage Two
A number of recommendations were made for further investigation and, from these, it was
agreed that the role of the Charge Nurse should be looked at, in particular their conflicting
roles as ward manager and provider of patient care. The investigation involved focus group interviews, and a sample of charge nurses was asked to complete a number of published scales and a tailored questionnaire. These Stage Two investigations identified
training, information and support needs of charge nurses aimed at reducing causes of stress,
promoting health and well-being, and optimising performance.

Stage Three
Due to the time constraints of the project, evaluation of the process within the Trust was
limited to a review of Stage One via feedback questionnaires administered to interviewees
and the Steering Group Leader. The interviewee questionnaire asked participants whether or
not the issues that were important to them had been tackled and how they felt about the
interview process. The questionnaire administered to the Steering Group Leader asked more
about the organisation’s perspective of the process and the outcomes of the audit.

Results

Stage One
Stage One was successful in identifying a number of sources of stress, for example: pace of
change; uncertainty about the future; and work overload, which is perhaps not surprising
given recent changes in the NHS at large. The OSHA also successfully identified a number of
sources of stress at a local level, including: poor communication among certain groups of staff
and poor relations between different professions.

Stage Two
The Stage Two investigations confirmed many of the findings of Stage One, despite the fact
that different individuals were interviewed. The following stressors were identified in both
stages: poor communication; lack of feedback and formal appraisal system; lack of clarity of
Business Manager role; and lack of support from Occupational Health. In addition the
following stressors were identified in this detailed investigation: lack of allocated time to
manage; duplication of effort in administrative tasks; poor management of change; and lack
of accountability. Recommendations were made to reduce the risk associated with the
stressors identified, an example of which is illustrated below:

Stressor: lack of allocated time to manage


Recommendation: There is a need for experienced staff at ward management
level, who understand operational issues and yet have sufficient influence
within the hierarchy to influence strategic development. Consideration
should be given to a supernumerary Charge Nurse role having responsibility
for a number of wards, and clinical management devolved to Charge Nurses.

Stage Three
An average of 94% of interviewees reported that their interview addressed relevant causes of
stress in the organisation.

Discussion
The application of the OSHA in the Trust is part of an ongoing programme of work to tackle
workplace stress. There are constant changes in the NHS and it may be argued that the
individual Trust is limited in terms of what it can do. This study has demonstrated that there
are a number of possibilities for reducing workplace stress at this local level.
The OSHA adopts the risk assessment and control cycle approach in terms of hazard identification; risk assessment; review of existing control measures; recommendations for improving control; and evaluation of controls. The approach has proved to be effective in
meeting these steps. Although the Trust showed a willingness to implement the
recommendations, due to the limited timescale of the study, evaluation of their impact on
reducing stress at source has not been carried out. It is hoped, as part of this ongoing programme of work, that the impact of the changes will be evaluated. This evaluation is intended to include a review of the impact of changes on sickness absence, job satisfaction, and staff
turnover throughout the Trust, as well as reviewing the impact on the role of Charge Nurses
specifically by re-administering the standard questionnaires.

It was possible to ensure an organisational, cross-functional approach whereby all levels and functions within the Trust were represented in interview. There is the possibility of
selection bias as the directorate representatives identified people for interview. However, in
many instances, the selection was determined by job title rather than specific individual. It is
recommended, in future applications of the approach, that an organisational chart be supplied
to the external auditors (IOM Team) complete with job titles and names of post holders, so
that they can select the participants in order to eliminate any selection bias.
The success of the approach is due, in part, to its flexibility in meeting the specific needs
of the organisation. This is achieved through the development of the company profile to allow
tailoring of the semi-structured interview, coupled with in-depth information from
representatives of the organisation. These are the main advantages that the approach has over
existing audit tools which administer a ‘standard’ questionnaire to all employees.
Organisations commented on the minimal disruption caused during administration of the
approach. The commitment and enthusiasm of the company contact is crucial to the success
of the approach and this person should be selected with great care.

Conclusions
The study reported here has demonstrated the feasibility of addressing stress in the same
manner as physical hazards in the workplace and adopting a risk assessment-hazard control
approach to reducing stress at source. Using appropriately skilled staff, backed by others to
advise on the interpretation and evaluation of findings, it has been possible to identify sources
of occupational stress and to indicate avenues for risk reduction. These recommendations
have been recognised as practicable by the Trust and some have already been acted upon. The
timescales of the work precluded the inclusion of an evaluative phase to determine the
success of the outcome in terms of reduced stress at work. However, a full evaluation is
envisaged as part of this ongoing programme to tackle stress within the Trust.

References
Cooper, C.L., Marshall, J. 1976, Occupational sources of stress: a review of the literature relating to coronary heart disease and mental ill health, Journal of Occupational Psychology, 49, 11–28
Cox, T. 1993, Stress research and stress management: Putting theory to work. Health & Safety Executive contract research report No.61/1993, (HMSO, London)
DeFrank, R.S., Cooper, C.L. 1987, Worksite stress management interventions: Their effectiveness and conceptualisation, Journal of Managerial Psychology, 2, 4–10
HSC 1988, Control of Substances Hazardous to Health Regulations, (HMSO, London)
HSC 1992, Management of Health and Safety at Work Regulations, (HMSO, London)
HSE 1995, Stress at Work: A guide for employers, (HSE Books, Sudbury)
Karasek, R.A., Theorell, T. 1990, Healthy Work: Stress, Productivity and the Reconstruction of Working Life, (Basic Books, New York)
Murphy, L.R. 1988, Workplace interventions for stress reduction and prevention. In C.L.Cooper, R.Payne (Eds) Causes, Coping and Consequences of Stress at Work, (Wiley, Chichester)
Nilsson, C. 1993, New strategies for the prevention of stress at work, European Conference on Stress at Work—A call for action: Brussels Nov 1993, Proceedings, (European Foundation for the Improvement of Living and Working Conditions, Dublin)
Warr, P.B. 1993, Job features and excessive stress. In R.Jenkins, N.Coney (Eds) Prevention of Mental Ill Health at Work, (HMSO, London)
TELEWORKING
TELEWORKING: ASSESSING THE RISKS

Maire Kerrin, Kate Hone and Tom Cox

Centre for Organizational Health and Development


Department of Psychology
University of Nottingham
University Park
Nottingham
NG7 2RD

This paper discusses the risks to well-being which may be associated with
teleworking and reports a small comparative study of ‘teleworkers’ and
office workers from within the same organisation. The survey showed that in
this organisation older and female workers were most likely to choose
teleworking as an option. The survey also revealed that the teleworkers used
VDUs for longer and took fewer rest pauses than office workers matched for
age and gender. However, these differences in working practices were not
associated with any differences in health outcomes. This study also
highlighted practical problems with applying current definitions of
teleworking and the paper therefore presents an alternative conception to
guide future research in this area.

Introduction
There are various indicators that teleworking is becoming a more prevalent form of work (e.g.
Huws, 1993). Furthermore, improvements in information technology and
telecommunications mean that there is now the opportunity for more and more traditionally
office-based work to be performed from an employee’s own home. This move is being
encouraged by policy makers because of the potential to create jobs in rural areas and to
reduce commuting. However, less attention has typically been paid to the effects which
teleworking might have on individual workers. Clearly it is important to assess the
psychosocial and physical impact of this new form of working and to ensure that appropriate
measures are taken to protect worker well-being.
In a recent review for the European Foundation for the Improvement of Living and
Working Conditions, Cox et al (1996) have highlighted a range of hazards (both physical and
psychosocial) which may be associated with teleworking. These include poor design of
workstations, social, organisational and physical isolation, poor relations with work
colleagues and superiors, lack of social support, poor work patterns, poor break taking habits,
poor management, lack of promotion opportunities, conflicting demands of work and home,
and lack of training. Some of these types of hazard are known to be associated with the
experience of stress and/or poor health outcomes such as musculoskeletal problems and
eyestrain (Cox, 1993). Furthermore, some of the potential hazards identified (e.g. isolation;
workplace design for some classes of self-employed teleworkers) are not adequately covered
by current E.C. legislation (Cox et al, 1996). However, despite the potential importance of
this issue, very little empirical research has assessed either the risks or the health outcomes
associated with teleworking.
There is some limited evidence that teleworking may be associated with poor health outcomes. For instance, an unpublished survey by the Institution of Professionals, Managers and Specialists (IPMS) of 103 teleworking professionals from the civil service in the UK found a higher incidence of health problems (such as eyestrain, backache and joint pain) than would
be expected given the type of work being performed. Similarly a UK Employment
Department survey of non-employee teleworkers carried out in 1992 found a number of
workers with cramped and potentially dangerous working conditions and a high incidence of
Work Related Upper Limb Disorders (WRULD) and other work related problems. However,
as neither of these studies employed a control group it is impossible to determine whether the
problems experienced were specifically related to teleworking or were merely indicative of
the particular organisational culture or job type involved. A comparative study of teleworkers
and non-teleworkers was carried out as part of the PATRA project (see Dooley et al, 1994).
They found no evidence of an increase in work-related health problems in teleworkers
compared to office workers (in fact office workers reported more eye-strain, headaches and
upper limb pain). However, they do not report the number of respondents studied, nor what
types of work were being performed, nor the demographics of the sample. Judging from other
published work from the PATRA project (e.g. Dooley, 1996) it would appear that their work
was based on a relatively small sample of highly heterogeneous teleworkers from over 7
different European countries, performing a range of different types of job for a range of organisation types. Given that these variables are likely to have an important impact on
outcome measures it is unlikely that a study of this kind can tell us anything meaningful about
the distinct effect which teleworking has on individuals.
The aim of the survey reported here was to investigate the relationship between
teleworking, work hazards and health outcomes while holding other work-related variables
constant. The study compared teleworkers and office workers from within the same
organisation, performing broadly similar jobs. During the first stage of the research
questionnaires were sent to all members of sections of the UK part of the organisation where
teleworking was offered as an option to the employees. The data collected at this stage was
used to investigate any differences between those in the sample identified as teleworkers and
those identified as office workers. Teleworkers were identified as those who used
telecommunications and IT in order to perform their work at a location other than the central
office, at least 2 days per week. This criterion was based on several existing definitions of
telework (Cox et al, 1996; Huws, 1993). During the second stage of the research each
teleworker was matched to a non-teleworker from the same organisation according to age and
gender for statistical comparisons. This approach had the advantage over previous research of
controlling extraneous variables. The survey assessed a range of hazards which can be
associated with work and a range of health outcomes including general well-being and the
incidence of WRULDs. Given the paucity of existing data, and the conflicting nature of the
findings which have been reported to date, no specific predictions were made regarding the
expected results.

Method

Respondents
The survey was sent to ninety employees of a multi-national computer company who were
free to choose whether to telework or work exclusively in the central office. Fifty two
questionnaires were returned. Thirty eight respondents identified themselves as office
workers. Fourteen identified themselves as teleworkers. However, of the self-identified teleworkers, only ten could be accepted as teleworkers under the criterion of working away from the central office at least two days per week; the remaining four were excluded from further analysis.

The demographics of the sample obtained during the first stage of the research are shown in
table 1.

Table 1. Sample demographics

In the second stage of the research the ten teleworkers were matched to ten of the office
workers according to age and gender.
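
The exact matching procedure is not described in the paper; as an illustrative sketch only (invented records, and a simple greedy nearest-age match within gender), it might look as follows in Python.

# Illustrative sketch of matching each teleworker to an office worker of the
# same gender with the closest age. Records are invented; the procedure
# actually used in the study is not described in detail.
teleworkers    = [("T1", "F", 42), ("T2", "F", 51), ("T3", "M", 38)]
office_workers = [("O1", "F", 44), ("O2", "M", 37), ("O3", "F", 50), ("O4", "M", 29)]

def match_pairs(tele, office):
    """Greedily pair each teleworker with an unused office worker of the
    same gender whose age is closest."""
    available = list(office)
    pairs = []
    for tele_id, gender, age in tele:
        candidates = [o for o in available if o[1] == gender]
        if not candidates:
            continue  # no same-gender control left unmatched
        best = min(candidates, key=lambda o: abs(o[2] - age))
        available.remove(best)
        pairs.append((tele_id, best[0]))
    return pairs

print(match_pairs(teleworkers, office_workers))
# prints: [('T1', 'O1'), ('T2', 'O3'), ('T3', 'O2')]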

Questionnaire Design
The questionnaire was designed to measure both work-related hazards and negative health
outcomes. Psychosocial, physical and organisational hazards were assessed using the
Organisational Health Questionnaire (Smewing and Cox, 1995), the Work Characteristics
Checklist (Atkins, 1995) and an Ergonomics check list for assessing VDU workstation design
against the DSE regulations. Health outcomes were assessed using the General Well-Being
Questionnaire (GWBQ)(Cox et al, 1983). Work-related upper limb disorder (WRULD) symptoms
and eyestrain were assessed using the ‘Mannequin’: a pictorial representation of the human
upper body on which respondents rate the frequency and severity of pain experienced.

Results
During the first stage of the research, analysis of the demographics of the full sample (see
table 1 above) revealed that the teleworking group were significantly older than the non-
teleworking group (t(38)=2.49, p<0.05). Furthermore, there appeared to be an unequal distribution of male and female workers between the two types of work, with females being more likely to telework than males (χ2(1,40)=4.75, p<0.05).
Because of the substantial demographic differences between the samples of teleworkers
and non-teleworkers, the analyses reported below are based only on the matched sample of
teleworkers and office workers. Note that statistical comparison of age and gender in these
groups suggested that the matching had been successful.

Physical Hazards
The teleworkers reported using their VDUs for longer (M=5.9 hrs/day) than the office
workers (M=4.6 hrs/day) and taking fewer rest breaks (M=3.3 per day) than office workers
(M=5.5 per day). Both of these differences were found to be significant (t(9)=4.78, p<0.05;
t(9)=5.39, p<0.05 respectively).
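
As an illustrative sketch of this kind of matched-pairs comparison (the hours below are invented, and a standard paired t-test is used rather than the exact analysis reported):

# Illustrative sketch: paired comparison of daily VDU hours for ten matched
# pairs. Values are invented; the study reported t(9)=4.78, p<0.05.
from scipy.stats import ttest_rel

teleworker_hours = [6.5, 5.0, 7.0, 5.5, 6.0, 6.5, 5.0, 6.0, 5.5, 6.0]
office_hours     = [4.0, 4.5, 5.0, 4.0, 5.5, 4.5, 4.0, 5.0, 4.5, 5.0]

t, p = ttest_rel(teleworker_hours, office_hours)
print("t(%d) = %.2f, p = %.4f" % (len(teleworker_hours) - 1, t, p))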
Scores on the workstation design checklist revealed that both sets of workers had
workstations which met about the same number of the required design features (mean=14.4
out of a possible 22 for office workers, 14.33 for teleworkers). However, there was a greater
range of scores for the teleworkers (7–20) compared to the office workers (12–19), indicating
a wider variability in the adequacy of workstations.

Psychosocial and Organisational Hazards


Measures of perceived job control, job demand and organisational support (using Work
Characteristics Checklist, Atkins, 1995) did not show any significant differences between the
teleworkers and non-teleworkers. There were also no differences in the two groups’
perceptions of the health of their organisation (as measured by the Organisational Health
Questionnaire).

Health Outcomes
On the WRULD Mannequin, office workers reported feeling discomfort due to their work in a mean of 3.7 areas (s.d.=2.26), whereas teleworkers reported feeling discomfort in a mean of 2 areas (s.d.=2.54). The mean maximum discomfort reported in any one body area was 1.8 (out of 5) for office workers and 1.2 for teleworkers. The mean total discomfort score (no. of areas×severity) for office workers was 5.1 and for teleworkers was 2.9. None of these differences were significant. There were no significant differences between office workers and teleworkers on measures of general well-being.

Discussion
The results of the survey did not provide evidence for any difference between the teleworkers
and office workers in terms of either work related hazards or health outcomes. One exception
to this general pattern was the finding that teleworkers were spending significantly longer
working at a VDU and were taking significantly fewer rest pauses when working at the VDU
compared to matched office workers. However, this potential hazard was not associated with
any increase in negative health outcomes as measured by Mannequin and the GWBQ.
The demographics of the initial sample suggested that teleworking employees were
significantly more likely to be older and female. The findings also indicated that teleworking
was still a less popular choice than office working. As teleworking was optional within the
organisation the results suggest that certain types of people are more likely to choose
teleworking in preference to working at an office. This highlights the need for caution when
interpreting the results of comparisons between teleworkers and non-teleworkers because the
effects may be due to differences in the sample rather than due to teleworking (for example
there tend to be gender differences on GWBQ scores which would prove a confound if more
women are found to telework). The current study tried to avoid such problems by matching
teleworkers to non-teleworkers. However, because relatively few of the respondents could be
defined as teleworkers this led to a very small final sample size.
It is interesting to note that while the initial sample of teleworkers available in the current study was small (N=14), the final sample used in the analysis was made even smaller by the strict cut-off criterion which was used to define who was and was not a teleworker. In
fact some of those who defined themselves as teleworkers were actually spending a greater
proportion of their time in the central office than those who defined themselves as office
workers. Similar problems have been reported in other teleworking research (e.g. Dooley,
1996). This raises the question of whether current definitions of telework capture anything
psychologically meaningful. It could be argued that while we know so little about teleworking
and its effects on the individual, the introduction of arbitrary cut-off points to decide who to
study and who not to study is unhelpful. Such cut-off points may actually lead to a loss of
useful data about the impact of teleworking. For example it may be that some self-defined
teleworkers make frequent visits to the central office in order to cope with problems brought
about by the act of teleworking (such as isolation from work colleagues, lack of training
opportunities, etc.). It would thus be beneficial to investigate the relationship between work
characteristics, outcomes and the time spent in various locations, rather than simply using the
time variable to distinguish between groups of teleworkers and non-teleworkers. Similarly
the other main defining characteristic of telework is the use of Information Technology and
telecommunications equipment in order to perform work away from the central office
location. Researchers have argued about what kind of technology is needed before working at
home can be deemed to be “telework” (i.e. is using a telephone and laptop enough or do you
need a fully networked computer?). However, again it can be argued that while we know so
little about the impact of teleworking, arbitrary technological cut-offs in choosing who to
study should be avoided in the same way as arbitrary time cut-offs. Thus it is proposed that
researchers should study how outcomes vary with the sophistication of the technology
employed rather than excluding those using particular types of technology.
A new conceptualisation of teleworking can therefore be proposed in order to assist future
research design. Based on the discussion above it is argued that teleworking can be
conceptualised according to two core dimensions: (1) proportion of working time spent away
from the traditional work environment, and (2) extent to which telecoms and IT are used for
working away from the traditional work environment. Using these core dimensions of
telework (CORDiT) it is possible to incorporate the useful aspects of previous definitions
without their drawbacks. All workers can be placed along each of these two dimensions, and the
relationships between these measures and key outcome measures can be assessed. These
outcome measures would include those discussed in the current paper such as musculo-
skeletal problems, eye-strain and well-being. However, they could also include measures
which will be of interest to organisations such as productivity.
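
Purely to illustrate what recording the two CORDiT dimensions, rather than a binary teleworker/office worker label, might look like in an analysis data set, the sketch below uses hypothetical field names and invented values (Python, for brevity).

# Illustrative sketch: workers placed on the two CORDiT dimensions instead of
# being assigned a binary teleworker / office worker label. Values invented.
from dataclasses import dataclass

@dataclass
class Worker:
    worker_id: str
    time_away: float   # proportion of working time spent away from the office (0-1)
    it_use: float      # extent of telecoms/IT use when working away (0-1 rating)
    wellbeing: float   # an outcome measure, e.g. a GWBQ score

workers = [
    Worker("A", time_away=0.0, it_use=0.0, wellbeing=22.0),  # conventional office worker
    Worker("B", time_away=0.4, it_use=0.7, wellbeing=19.5),  # partial teleworker
    Worker("C", time_away=0.9, it_use=0.9, wellbeing=25.0),  # near full-time teleworker
]

# Outcomes can then be related to the continuous dimensions (e.g. by correlation
# or regression) rather than by comparing two arbitrarily defined groups.
for w in workers:
    print(w.worker_id, w.time_away, w.it_use, w.wellbeing)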
It is also necessary to note that the effects of teleworking on health and productivity outcomes
may not be direct. There are very many possible intervening variables which may mediate or
moderate the effect. These can be divided into variables which relate to the specific job (e.g.
type of work, task demands), those which relate to the organisation (e.g. promotion opportunities,
control over pacing), those which relate to the social and physical environment of work (e.g.
isolation, work station design), and those which relate to home life (work/home interface, support
from relatives). These variables will be valuable in understanding telework and how it has its
effects on worker well-being. It is also important to note that these variables are not unique to
telework (a view which seems to be implicit in much previous telework research) but apply to
all forms of work. This is where the dimensional approach to defining teleworking is important
because it allows office workers (who will score low on both core dimensions of teleworking) to
be considered within the same conceptual framework as all types of teleworker. Thus it is
proposed that future research should explore the impact of teleworking using a multidimensional
framework such as CORDiT. Other important variables (such as job type/organisational culture)
should either be controlled as in the current study, or, if enough respondents are available,
included in analysis of the results. In this way it will be possible to substantially increase our
understanding of how teleworking affects workers. Such understanding is vital if workers are to be protected from any potential ill effects of teleworking.

Acknowledgements
The authors would like to thank Lisa Long for collecting the empirical data.

References
Atkins, L. (1995) Work characteristics and upper limb disorder in clerical and academic
staff in university departments. Unpublished BA project, Department of Psychology,
University of Nottingham.
Cox, T. (1993) Stress Research and Stress Management: Putting Theory to Work. Sudbury:
HSE Books
Cox, T., Griffiths, A., and Barker, M. (1995) The Social Dimensions of Telework. Report to
the European Foundation for the Improvement of Living and Working Conditions.
Cox, T., Thirlaway, M., Gotts, G. and Cox, S. (1983) The nature and assessment of general
well being. Journal of Psychosomatic Research, 27, 353–359.
Dooley, B., Byrne, M-T., Chapman, A., Oborne, D.J., Heywood, S., Sheehy N. and Collins,
S. (1994) The teleworking experience. Contemporary Ergonomics. S.A.Robertson
(Ed.) London: Taylor and Francis.
Dooley, B. (1996) At work away from work. The Psychologist, 9, 155–158.
Huws, U. (1993) Teleworking in Britain. A report to the Employment Department.
Employment Department Research Series No 18.
EVALUATING TELEWORKING—A CASE STUDY

Samantha Campion and Anne Clarke

HUSAT Research Institute,


The Elms, Elms Grove,
Loughborough,
Leics. LE11 3RG
Telephone: +44 1509 611088
email: s.m.campion@lboro.ac.uk

The TWIN project (TeleWorking for the Impaired, Networked Centres Evaluation) was involved in data collection from up to 41 teleworkers with
disabilities at 10 pilot sites in 5 countries across Europe. This paper
describes the evaluation methodology developed within the project and the
results in terms of teleworking as an option for employment of people with
disabilities.

Introduction
The overall objective of the TWIN project was to assess the opportunities for development and
interconnection of specialised telework centres aimed at the integration of disabled people in
the labour market at a pan-European level. People with disabilities are under-represented in the
workplace (Andrich & Alimandi, 1995) and teleworking was perceived to be a method whereby
people with disabilities could be integrated into the working environment. The TWIN project
was set up to explore this possibility through studying existing examples of pilot sites that were
supporting teleworking for people with disabilities, both to determine the barriers and
opportunities that exist and to distil information on best practice.

The TWIN project involved seven partners, five of which were either pilot sites or were in touch with pilot sites. HUSAT’s role was assessment and evaluation of the existing pilot
sites. Initially a literature review was carried out to identify previous work in the area, and it
became clear that whilst actual reports of teleworking and case study information are many
and varied, when it comes to how to evaluate teleworking, information is scarce. A
methodology was developed within the project and was used to assess the situation at the
pilot sites. First the methodology developed will be discussed, then a broad outline of the
results will be reported.

A Teleworking evaluation methodology


In the same way that teleworking itself imposes changes from traditional working methods on
the worker, the manager and the organisation in carrying out the process of work (Gray et al, 1993; Huws, 1993; Korte and Wynne, 1996), it also imposes a requirement for a specific
evaluation methodology. This paper describes the methodology that was used within the
TWIN project.

Objectives
The objectives of the evaluation were to define a common methodology for monitoring and
collecting data from the pilot sites in a form that would enable the output from the evaluation
to feed into the major deliverable, a document reporting the global knowledge acquired
during the project in terms of job opportunities; trans-national teleworking; efficiency and
productivity; impacts on the employer, the networked centres and the teleworkers;
rehabilitation opportunities; social impact and human factors issues.

Procedure
It was found within the TWIN project that the most useful way to structure the evaluation to
provide the necessary data was to split the issues into 3 main levels, these being individual,
organisational and wider (societal) issues. Use of this method is corroborated by other studies
of telework (Nilles, 1987; Korte et al., 1988, and Qvortrup, 1993).

Baseline evaluation

Initially the TWIN project undertook a ‘baseline’ evaluation of the pilot sites, individual teleworkers, potential teleworkers and trainees. Two questionnaires were prepared: a site
questionnaire and an individual questionnaire. These questionnaires allowed a picture to
develop of the present situation, leading to:

• Identification and selection of different groups of users and additional stakeholders who could take part in the evaluation.
• Identification and selection of comparison groups (controls) who could take part in the evaluation.
• Identification of opportunities for networking/teleworking between pilot sites.

It became clear after the initial data capture exercise that the circumstances at the pilot sites
were not uniform. Whilst all the pilot sites are involved in promoting teleworking initiatives
for the disabled, each site had found its own particular methods for doing this, and worked
within its own national and commercial arena.

Full Evaluation

Taking into account the above baseline evaluation and the wishes of the pilot sites involved in
the project, the full evaluation took the form of a number of longitudinal case studies. Given
the diversity of the 10 pilot sites, in terms of organisational structure, and access to
teleworkers, the output of the monitoring and assessment to be reported back was defined
centrally (by HUSAT). However, the method of collecting the data at each pilot site was
decided locally. This allowed the most appropriate form of data collection to be used at each
site. Where sites did not have the necessary skills, HUSAT performed a support role, and
defined more closely the tools to be used to collect the data.

A range of tools appropriate for the specific information being collected were made available
to the pilot sites, who then chose which they would implement locally. This was felt to be the
best method, given the cultural, language and structural differences that existed between sites.
The information was then formatted by the local TWIN partner, and reported to HUSAT in a
uniform and consistent manner. Information was then analysed and disseminated to all pilot
sites, again in a uniform and consistent manner.

Results of using the methodology


The evaluation methodology used within the TWIN project was very successful, in that it
identified the main barriers and opportunities that existed in trying to implement teleworking
for people with disabilities, and enabled the production of a document detailing best practice.
There were particular areas that were observed to work well and others that needed
improvement. These were identified as:

• Allowing flexibility of data collection. The TWIN partners reported that a semi-formal
method of interviewing allowed a richer picture to develop and added value to the
reporting process, allowing surrounding issues to be discussed as well.

• Repeated handling of data and translation issues. The effects of translation could have
been alleviated by the employment of professional translators, or by the use of yes/no or
tick box type answers as much as possible, to reduce the translation needs to a minimum.

• Frequency of reporting too high for levels of change occurring. The effect on teleworkers
of having to respond to the monitoring activities became gradually more negative,
resulting in lower quality and quantity of responses.

• Teleworkers interpreting that the monitoring was of them rather than of telework. In a minority of cases, the monitoring procedure was misinterpreted as a means for the pilot sites to monitor teleworkers’ own personal progress (or lack thereof, through no fault of their own). This misunderstanding led to outright non-cooperation.

Results of the evaluation


Opportunities identified

Theoretically teleworking allows a new way of working where the work can come to the worker
rather than the worker having to travel to the work. In this way people with disabilities can carry
out tasks in their own time and within their individual capabilities. The other benefit that
teleworking offered was an opportunity to be assessed without reference to disability, as the
employer may well not be aware of the disability of the teleworker. A further benefit was the opportunity to carry out work whilst remaining in an adapted environment and without the requirement to travel, which can be difficult and can consume limited individual energy resources.

Barriers identified

Whilst teleworking can be seen to offer opportunities to people with disabilities, there are also
barriers. The main barrier identified within the TWIN project, and one that was a factor at all but one of the participating sites, was the benefits trap. An assessment of disability is often an all-or-nothing
assessment, which does not take account of fluctuations in the ability to perform work. People
with disabilities who take on work when they are able find their benefits stopped completely,
and may well find that when they are unwell and unable to perform work, they cannot claim
disability benefit. Therefore, given the uncertainty of their ability to perform work in the future,
and given the amount they would have to earn to make up for the loss of benefits, many people
with disabilities felt that the risks were too great for them to take up teleworking.

The situation was found to be different only in Greece, where benefits were not linked to whether or not an individual was working, but were paid regardless. However, even in this situation there were barriers. The main barrier seemed to be the ‘culture shock’ that many of the people with
disabilities experienced. They had been used to having their time as their own, and found it
difficult to carry out work to an acceptable quality and to a set timetable. However the work
they were given was boring and repetitive, and support for the workers was limited.

In Finland the workers were well supported and had access to suitable equipment and training. There seemed to be a real will to succeed, shown by the fact that the work continued after the end of the project; however, even in this case the level of actual teleworking employment was low.

Another barrier for many of the teleworkers was a lack of equipment. The cost of adequately equipping a teleworker was found to be beyond the means of many of the pilot sites. This is to some extent a ‘chicken and egg’ situation: to carry out work as teleworkers, people need access to equipment and training, but the finances needed to provide that equipment and training are not available.

Another barrier was employer perceptions. Teleworking is still a new concept to many employers, and they are wary of trying it. Combining the task of selling the concept of teleworking with that of persuading employers to take on people with disabilities creates a double barrier for the pilot sites to overcome. There are prejudices against both teleworking and employing people with disabilities that are unfounded but nevertheless very strong. This problem was highlighted by the fact that during the whole 18 months of the project, only one or two people managed to find work that was not directly supplied by a pilot site. In some ways these problems were understood before the project started, and some pilot sites attempted to act as ‘work brokers’ for the teleworkers. However, resources were limited and there was little willingness from employers to consider this form of employment as valid and valuable.

Discussion
Although these results seem to indicate little success for the pilot sites and for the individual teleworker, there were opportunities above and beyond those of teleworking for the sake of earning a living. The access to equipment and training, even when limited, allowed the potential teleworkers to learn new technologies and skills. There was an improvement in quality of life: even where work was not forthcoming and many potential teleworkers found themselves frustrated, many more preferred the present situation to what had been available previously and were satisfied.

In hindsight, the project may have been over-ambitious in its aims. The access to technology and training, and the concomitant improvement in the quality of life of many of the people with disabilities who took part in the project, is a laudable aim in itself, without the necessity of also finding employment within 10 months. Continuing the training, enabling people with disabilities to learn new skills and new technology, may eventually lead to work opportunities, so long as the training and access to technology continue.

The project encountered the perennial problem of teleworking being seen as a panacea. Teleworking is not a job; it is a method whereby a job is performed. Therefore it is necessary to have a skill or activity, and then to deliver that skill or activity via teleworking; the absence of a skill still means poor employability. During the project one pilot site found itself inundated with calls from people with disabilities who wanted to work as teleworkers; however, when asked what service they intended to offer, they replied ‘teleworking’. They wanted to attend courses on how to become teleworkers, then offer their teleworking skills to employers. This misperception of teleworking is not uncommon, and led to resistance amongst employers.

In many cases there is a need to supply two sets of skills: the skills to perform a particular work role and the skills to deliver that work role via teleworking. These extra teleworking skills span both technological and personal areas, including an understanding of the technology, use of modems and data transfer mechanisms, dealing with the social and practical issues of teleworking, and self-motivation and time management, to name but a few.

Conclusions
The evaluation methodology used within the TWIN project, of a baseline evaluation followed by a set of longitudinal case studies, allowed a rich description of the issues involved in
teleworking for the disabled. The major opportunities and barriers were identified, and these
were found to be consistent across the pilot sites.

Although teleworking can provide a means by which people with disabilities can be integrated into the work environment, there are many barriers to be overcome. However, even where the training, skills and equipment have not yet led to employment, the situation can still be termed a success for the individual in terms of improved quality of life.

The issue of whether teleworking is an enabling technology or a ghettoising of the disabled (i.e. where they are not introduced into the mainstream of the working environment, but instead are kept in their own homes) depends on the reasons for teleworking. It is recommended that teleworkers spend some proportion of their time ‘in the office’ to maintain links with the rest of the organisation. Teleworking is not a way for employers to avoid adapting the workplace to enable access. However, in extreme cases where the disability precludes these visits, working totally from the home or a telecentre is a viable alternative to no employment.

Acknowledgements
This paper was based on work carried out by the TWIN project (TeleWorking for the Impaired,
Networked centres evaluation) funded by the CEC under programme DGXIII. The authors
wish to acknowledge all of the project partners who contributed to the success of the project.

References
Andrich, R., Alimandi, L. eds. 1995. TWIN (T1003) Guidelines for setting up teleworking
centres integrating disabled people.
Wynne, R., Cullen, K., Mercinelli, M., Andrich, R., Alimandi, L., Campion, S., Clarke, A.,
Ashby, M., Webb, L., Carter, C., Leondaridis, L., Anogianakis, G., Savtschenko, V.
1995. TWIN (T1003) Deliverable D1.T1: Technological and Socio-economic
Requirements and Opportunities.
Gray, M., Hudson, N., Gordon, G. 1993. Teleworking Explained. John Wiley & Sons.
Huws, U. 1993. Teleworking in Britain. A report to the Employment Department
Korte, W.B., Wynne, R. 1996. Telework Penetration, Potential and Practice in Europe. IOS
Press
Nilles, J.M. 1988, Managing Teleworking: Final Report, University of Southern California, Los Angeles
Korte, W.B., Robinson, S., and Steinle, W.J. eds. 1988, Telework: Present Situation and
Future Development of a New Form of Work Organisation, North-Holland,
Amsterdam
Qvartrup, L., 1993, ‘Flexiwork and Telework Centres in the Nordic Countries, Trend and
Perspectives,’ Paper presented at the ECTF International Seminar “Flexiwork Policy
in the European and Nordic Labour Markets”. Helsinki, May 18–19, 1993
TEAM WORKING
TEAM ORGANISATIONAL MENTAL MODELS:
AN INTEGRATIVE FRAMEWORK FOR RESEARCH.

Janice Langan-Fox, Sharon Code, and Geoffrey Edlund.

Department of Psychology
The University of Melbourne, Australia.

In recent years, researchers from a broad and disparate range of disciplines have explored the utility of the ‘mental model’ (Rogers, 1993), ‘collective mind’
mind’ (Weick & Roberts, 1993), ‘cognitive maps’ (Langfield-Smith, 1992),
and ‘team mental model’ (Cannon-Bowers, Salas, & Converse, 1993). In
general, these diverse literatures present a confusing array of concepts and
meanings with little coherence or systematic research. Further, much of the
research stems from aviation psychology, where samples are homogeneous,
and topics uniquely applicable to defence scenarios (eg, cockpit behaviour)
are favoured. The present research integrates cognitive, social and aviation
psychology, by presenting a framework for analysing ‘team organisational
mental models’. This unifying model should be useful for the ergonomist
and psychologist investigating shared cognition, group dynamics, and
teamwork in organisations.

Introduction.

Teams, organisations, and mental models: Building a framework for research.


Recent reviews of the mental models literature (eg, Klimoski & Mohammed, 1994) have
highlighted the lack of consensus on what is meant by the term ‘mental model’. Besides this
problem of definition, there are empirical and theoretical problems in developing a
conceptual model for research purposes. For instance, apart from several mental model
frameworks (see Cannon-Bowers, Salas, Tannenbaum, and Volpe, 1995), there appears to be
no theory of mental models which would enable accurate predictions in different contexts.
What is needed is an integrative framework which links the mental models literature to those
other variables of interest which have been associated with the concept, eg, situation
awareness (Endsley, 1995), team skills (Cannon-Bowers et al, 1995). The current paper
addresses these issues through the development of a methodology which casts light upon the
everyday problems and issues confronting people in the workplace, and which was
developed through a series of pilot studies in a large, national, government business
enterprise. The research aimed to answer the following questions: (a) What are the shared
understandings between employees involved in problem solving teams, and how should these
be measured? and (b) What team dynamics variables contribute to the development of shared
mental models? A more general aim was to assess the success of a training program designed
to facilitate employee participation (EP) in problem solving teams about workplace issues.
The training program emphasised the importance of reaching consensus and shared or
common perceptions in solving these problems.
The concept of mental models originated from Craik (1943), who argued that we
construct internal models of the environment around us which form the basis from which we
reason and predict the outcome of events before carrying out action. More recently, Rogers
(1993 pp.2) defined mental models as “internal constructions of some aspect of the external
world that can be manipulated enabling predictions and inferences to be made”. Wilson and
Rutherford (1989) argued that mental models are constructed from the background
knowledge that an individual has of a system or task, and consist of just those aspects which
are needed to solve a problem at a particular point in time. Other authors (eg, Bainbridge,
1991), more typically use the term mental model to refer to certain contents of long-term
memory. They argue that working memory is constructed from moment to moment, from the
interaction of the contents of long-term memory (including mental models) and information
from the environment. Here, the concept will mainly apply to the long-term representation of
knowledge, however it is acknowledged that the contents of a user’s mental model may be
made available to working memory.
In order to work successfully on decision making tasks, groups must perceive, encode,
store, and retrieve information in a parallel manner; the quality of the group’s output will
depend not only on the information available to individual group members, but also on the
‘shared mental model’ present in the group (overlap between individual mental models).
Shared mental models help team members to explain what is happening during task
performance, and to arrive at common explanations across members. In turn, these lead to the
development of shared expectations for task and team demands. Cannon-Bowers et al (1993)
suggest that effective team performance can only occur when team members share an
understanding of both the task and team, and the general context in which they operate. Their
comment underscores the importance of situation awareness, with the link between team
mental models and situation awareness having been proposed by several authors (eg, Smith &
Hancock, 1995).
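
As a purely illustrative aside, and not a method used in any of the studies cited above, the degree of overlap between individual mental models can be quantified with a simple set-based index such as the mean pairwise Jaccard similarity of the concepts each team member elicits about the task. The Python sketch below, using entirely hypothetical concept labels, shows the arithmetic involved.

from itertools import combinations

def jaccard(a, b):
    # Proportion of concepts shared by two members' elicited models.
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def team_sharedness(models):
    # Mean pairwise overlap across all members: a simple proxy for the extent
    # to which a 'shared mental model' is present (0 = none, 1 = identical).
    pairs = list(combinations(models.values(), 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Hypothetical concepts elicited from a three-person problem-solving team.
models = {
    "member_1": {"deadline", "budget", "customer", "quality"},
    "member_2": {"deadline", "quality", "staffing"},
    "member_3": {"budget", "quality", "customer"},
}
print(round(team_sharedness(models), 2))   # 0.45 for these hypothetical data
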
Whilst there are large numbers of studies in aviation psychology and group psychology,
there appear to be few studies available on team mental models in organisations. As noted by
Klimoski and Mohammed (1994), most researchers have simply assumed the existence of
shared knowledge structures, and not attempted to measure them. Further, no attempt has
been made to empirically examine processes which contribute to team mental models in
organisations. Thus various literatures were reviewed in order to develop an integrated
framework for research: EP, team dynamics, and team mental models.

Team Organisational Mental Models.


EP team dynamics
Much of the interest and support for research on groups can be attributed to their prevalence
in organisations and society. Groups are widespread in modern organisations, with
autonomous work groups, labour-management committees, product development teams, and
executive committees representing some of the many examples of groups that now form vital
parts of organisational life. Understanding how such groups function therefore has important
theoretical and practical implications. Researchers are also devoting more attention to how
groups manage their external relations, and adapt to changes in the environment (Argote &
McGrath, 1993).
Industrial democracy is defined as “the significant involvement of workers in the important decisions that affect their lives”, and is achieved through employee participation (Davis and Lansbury, 1996, pp. 6). In recent years, EP has grown significantly in both Britain and the US, with approximately half of all large, unionised manufacturing firms running ongoing EP programs.
One difficulty in analysing problem solving groups at work, such as EP teams, lies in the fact that the ‘problem’ can be ongoing, and/or the output, product, or solution is not readily observable or quantifiable, nor conveniently measured at one particular point in time (eg, at the beginning of team formation). For our purpose, it was relevant and appropriate to understand team processes rather than, say, group performance (eg, output in some form), given that the teams had been involved in solving different workplace problems for differing periods of time, in different types of work units. Because the research on group and team dynamics is extensive and spans a number of different domains, the team mental model literature was reviewed with a view to selecting those group or team variables which were directly relevant to team organisational mental models and to employee participation. These included status characteristics, teamwork behaviours and skills, individual differences, and situation awareness (or team-environment interaction).

Status characteristics: The link between shared mental models and group
process
Status characteristics theory (SCT) is a cognitive theory which contends that influence in
small groups (eg, decision making groups) is caused by expectations about performance
activated by status characteristics, “…any characteristic that has differentially evaluated
states that are associated directly or indirectly with expectation states” (Berger, Fisek,
Norman, & Zelditch, 1977 pp. 35). Status characteristics may be external to the group
(diffuse), or emerge during the course of interaction (specific). Diffuse status characteristics
include gender, education, seniority, and work level. Expectations associated with a specific
status characteristic are limited to a particular task or situation (eg, expert status). In social
interactions, high status affiliates of a status characteristic are expected to possess greater
general competence than low status affiliates. High status affiliates are thus treated by others
as though they are more competent than low status affiliates. Once formed, this shared order
of performance expectations tends to become self-fulfilling, by shaping the interactant’s
propensity to offer goal related suggestions, and the likelihood that others will attend to,
positively evaluate, or accept these. Such deferential behaviours in effect treat their recipient
as a more valued contributor to the group goal, and consequently communicate situational
worthiness. In turn, status processes may indirectly affect the development of team mental
models, via their more direct effect on goal commitment, communication openness, and
teamwork.

Teamwork behaviours and skills.


Based on an extensive review of the past literature on team performance, Cannon-Bowers et
al (1995) proposed that the specification of competencies for teams in the workplace should
be derived from: (1) the requisite knowledge, principles and concepts underlying the team’s
effective performance; (2) the repertoire of required skills and behaviours necessary to
perform the team task effectively; and (3) the appropriate attitudes on the part of team
members that foster effective team performance. On the basis of this extensive review,
Cannon-Bowers et al (1995) argued that teamwork can be defined in terms of the following
skill dimensions: adaptability, shared situational awareness, performance monitoring and
feedback, leadership/team management, interpersonal relations, coordination,
communication, and decision making, and that these teamwork skills are instrumental to the
development of shared mental models. Thus, it is also important to investigate the
contribution of certain individual differences.

Individual differences.
Individual difference variables which have been hypothesised to affect the development of
shared understandings in teams, include tenure, group history, proximity, goal commitment,
and perceptions of task difficulty. It should be noted that some individual difference variables
(eg, tenure) have the potential to act as status characteristics, since individual differences may
become salient when they discriminate between the actors in the situation, even when they
are not directly connected to the task. Initially irrelevant characteristics will become involved
in a social situation unless their applicability is challenged (eg, by previous information). If their applicability is not challenged, the members of the group, as part of the normal course of interaction, will act as though the characteristics are relevant to the task.

Situation awareness (SA) and team-environment interaction (TEI).


For the most part, SA has been applied in aviation teams and in military decision-making
environments (eg, Endsley, 1990). Salas, Prince, Baker, and Shrestha (1995) proposed that the
construct also applies to other types of teams. Oranasu (1990) described a crew’s shared mental
model as arising out of the articulation of SA (interpreting situation cues) and metacognition
(defining the problem and devising a plan for coping with it). Orasanu argued that casting a situation
into a commonly shared frame of reference integrates information into an overall coherent
picture (a mental model). The extent to which this picture can be achieved may depend on group
process variables such as group history, and the quality of communication and trust between
group members. Some authors see SA as a characteristic of the agent, who may be said to have
good or poor SA on the basis of his/her propensity (Endsley, 1988). Others (eg, Smith & Hancock,
1995) argue that SA is not resident in the agent, but rather exists in the invariant interaction of
the agent and the environment. It is also possible to talk about team SA, “the sharing of a
common perspective among two or more individuals regarding current environmental events,
their meaning and predicted future states” (Wellens, 1989 pp.6). Thus team-environment
interaction was an important variable in the research framework.
For many years now, teams have become a feature of daily working life, with most
organisations requiring workers to be involved in shared work activities and goals. It is timely
to assess how team processes relate to constructs representing shared cognition about the
team task and its importance, and to assess the extent of influence of various group
differences by virtue of group, individual, and status characteristics. The framework drawn
above suggests the important variables in this endeavour.

References.
Argote, L. and McGrath, J.E. 1993, Group processes in organisations: Continuity and
change. In C.L.Cooper and I.T.Robertson (eds.) International Review of
Organisational and Industrial Psychology 1993 Volume 8, (John Wiley, Chichester)
Bainbridge, L. 1991, Mental models in cognitive skill. In A.Rutherford and Y.Rogers (ed.).
Models in the Mind, (Academic Press, New York)
Berger, J.M., Fisek, H., Norman, R.Z., and Zelditch, M. 1977, Status Characteristics and
Social Interaction: An Expectation-States Approach (Elsevier, New York)
Cannon-Bowers, J.A., Salas, E. and Converse, S. 1993, Shared mental models in expert
team decision making. In John Castellan, Jr. (ed.) Individual and group decision
making, (Lawrence Erlbaum Associates, New Jersey), 221–246
Cannon-Bowers, J.A., Tannenbaum, S.I., Salas, E., & Volpe, C.E. 1995, Defining team
competencies and establishing training requirements. In R.Guzzo & E.Salas (eds.)
Team Effectiveness and Decision Making in Organisations, (Jossey-Bass, San
Francisco), 333–380
Craik, K. 1943, The Nature of Explanation, (Cambridge University Press, Cambridge)
Endsley, M.R. 1988, Design and evaluation for situation awareness enhancement. In
Proceedings of the Human Factors Society 32nd Annual Meeting, (Human Factors and
Ergonomics Society, Santa Monica), 97–101
Endsley, M.R. 1990, Predictive utility of an objective measure of situation awareness. In
Proceedings of the Human Factors Society 34th Annual Meeting, (Human Factors and
Ergonomics Society, Santa Monica), 41–45
Endsley, M.R. 1995, Toward a theory of situation awareness in dynamic systems, Human
Factors, 37(1), 32–64
Klimoski, R., & Mohammed, S. 1994, Team mental model: Construct or metaphor? Journal
of Management, 20, 403–437
Langfield-Smith, K.M. 1992, Exploring the need for a shared cognitive map, Journal of
Management Studies, 29, 349–368
Orasanu, J. 1990, Shared mental models and crew decision making. Paper presented at the
12th Annual Conference of the Cognitive Science Society, (Cognitive Science Society,
Cambridge).
Salas, E., Prince, C., Baker, D.P., & Shrestha, L. 1995, Situation awareness in team
performance: Implications for measurement and training, Human Factors, 37(1),
123–136
Smith, K. and Hancock, P.A. 1995, Situation awareness is adaptive, externally directed
consciousness, Human Factors, 37(1), 137–148.
Weick, K.E. and Roberts, K.H. 1993, Collective mind in organisations: Heedful interrelating
on flight decks, Administrative Science Quarterly, 38, 357–381
Wellens, A.R. 1989, Effects of communication bandwidth upon group and human machine
situational awareness and performance (Final Briefing), (Armstrong Aerospace
Medical Research Laboratory, Ohio)
Wilson, J.R. and Rutherford, A. 1989, Mental models: Theory and application in human factors, Human Factors, 31, 617–634
THE IMPACT OF IT&T ON VIRTUAL TEAM WORKING
IN THE EUROPEAN AUTOMOTIVE INDUSTRY

Chris Carter and Andrew May

HUSAT Research Institute,


The Elms, Elms Grove,
Loughborough,
Leics. LE11 3RG
Telephone: +44 1509 611088
email: c.carter@lboro.ac.uk, a.j.may@lboro.ac.uk

The following paper is based on some of the results from the TEAM (Team-
based European Automotive Manufacture) project. The objective of TEAM
was to investigate how advanced IT&T (Information Technology and
Telecommunications) tools could support working between virtual team
members along the automotive supply chain, with the aim of improving product
quality and reducing the costs and time-to-market of introducing a new vehicle. A TEAM software demonstrator was developed, based on the results of a user requirements exercise, and a series of workplace-based user evaluations was carried out involving manufacturer and supplier engineers. User and
technical evaluations have shown the potential for improving communication
and collaboration during the Product Introduction Process (PIP).

Introduction
The European automotive industry is in the process of globalisation (Lamming, 1994; Simpson, 1996) and is characterised by increased delegation of technical responsibility to
Tier 1 suppliers and the downward supply chain for the design, production and integration of
assemblies. Global market pressures dictate increasing adoption of Concurrent Engineering,
where the integration of design and manufacturing activities and maximum parallelism of
working practices is sought. This is becoming the de facto method for product development
(Lawson and Karandikar, 1994), and to achieve this, there is a fundamental need for
technology to support distributed engineering teams (Londono, Cleetus et al., 1992).
Despite the emergence of new technology, standard day-to-day communication and
collaboration between European automotive engineers at different locations still centres
heavily on telephone calls, face-to-face meetings, faxes and the post. These methods are often
either inefficient, or not very effective at representing engineering issues and enabling
interactive, real-time problem resolution.
The TEAM project investigated how advanced IT&T tools such as audio-video conferencing,
application sharing, shared whiteboards and distributed data product libraries might enable
more effective and efficient working between geographically separated manufacturers and
suppliers along the automotive supply chain. This paper outlines the evaluation approach, results
and conclusions from the user evaluations carried out within the project.

TEAM Project Evaluations


A user requirements exercise undertaken at the beginning of the project identified the basic
requirements for virtual team working within the European automotive industry. A ‘best in
class’ software demonstrator was then developed to run over heterogeneous hardware
platforms, and a series of user and technical evaluations and demonstrations undertaken in the
UK, Italy, Ireland and France.
A total of approximately 40 engineers in the UK and Italy took part in the user trials. All
were potential end-users of TEAM-type technology in that they worked with colleagues in
their company or other companies at remote sites.
Initial trials centred around realistic scenarios of use involving engineers at manufacturer
and supplier sites. These employed standard human factors evaluation techniques such as
direct observation, questionnaires and structured interviews, and identified improvements and
changes to the system functionality and ease of use; these were incorporated where possible.
Evaluation criteria for the TEAM demonstrator were established, based on the efficiency,
effectiveness and satisfaction of using TEAM to support PIP activities. The PIP is defined
here as all of the stages necessary to go from product conception to volume manufacture, i.e.
concept, feasibility, packaging, detailed design, pre-production validation, and on-going
problem solving. The following evaluation criteria were used:

• Degree of actual use of TEAM for day to day virtual team working (to enable
quantification of the resources used, relating to efficiency of working).
• Effectiveness of using TEAM for virtual team working (how well collaborative
engineering activities were supported).
• Initial and ongoing support requirements for TEAM type technology when implemented
in the workplace.
• Reliability of the hardware, software and networks in a working environment.
• Perceived and realisable benefits compared with current and alternative practices.
• Satisfaction with, and motivation to use TEAM (all stakeholder perspectives).
• The impact of TEAM on the organisational and business activities.
• The extent of buy-in and commitment from stakeholders for implementation.

Final evaluations involved real working by engineers, solving live design issues on current
vehicle programmes. These final user trials, employing the above evaluation criteria, are
outlined below.
Supplier and manufacturer engineers at Rover Group, Key Plastics and TRW Steering
Systems Ltd undertook a series of collaborative sessions over a period of 4 weeks in the
workplace. The evaluation approach taken was simple and robust, non-intrusive and required
minimal effort from the users. Session proformas were completed by the engineers for each
collaborative session; these included the following: participants involved in the session, the
engineering project under discussion, aims of the session, applications used, details of both
electronic and paper-based information used, whether additional data were required or would
have been useful during the session, how successful the session was with respect to its aims,
the degree of overall satisfaction with the session, and improvements or changes to the
system which would have made the session more successful.
The collaborative working in the UK addressed the following design issues on new
vehicles: the design of a Power Assisted Steering reservoir; the design of steering
components; and the design of the PCB, casing and labelling for an engine
management ECU. In Italy, work centred around: the design of a dashboard system; design of
a vehicle seat; problem solving on a door design; and work on a suspension system.

After the final collaborative session, structured interviews were carried out with all the
participating engineers, the systems support personnel, and business managers. The
interviews addressed the requirements for a system such as TEAM, the benefits and
drawbacks, and also looked at the wider issues concerning the future implementation of
TEAM technology in the business, organisational issues and the impact on supplier-
manufacturer relationships.

TEAM Technology
A range of user trials were undertaken. Primary rate 6-Channel ISDN was used as the Wide
Area Network (WAN) to link the UK user companies; this offered the optimum price/
performance ratio and a bandwidth of up to 384 Kbits/s. CISCO 4500-M routers interfaced
between the company Local Area Networks (largely based on Ethernet) and the ISDN WAN.
In Italy trials between Fiat CRF and Magneti Marelli were undertaken using 8-Channel
ISDN. Broadband trials were also carried out between Rover UK and Siemens Automotive
(France) using a mixed SuperJanet/ATM network, providing 2 Mbits/s.
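
The narrowband figures quoted above follow from the nominal 64 kbit/s capacity of a single ISDN B-channel. The short Python sketch below simply reproduces that arithmetic for the configurations used in the trials; it is an illustration only and was not part of the TEAM demonstrator.

B_CHANNEL_KBIT_S = 64   # nominal capacity of one ISDN B-channel

def isdn_bandwidth(channels):
    # Aggregate bandwidth (kbit/s) of an n-channel ISDN connection.
    return channels * B_CHANNEL_KBIT_S

print(isdn_bandwidth(6))   # 384 kbit/s, as used for the UK trials
print(isdn_bandwidth(8))   # 512 kbit/s, as used for the Italian trials
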
The TEAM demonstrator comprised a suite of proprietary and bespoke applications. User
sites employed differing software according to their specific requirements. The basic
applications used in the UK were ‘Communique’ for audio/video/whiteboard conferencing,
‘TeamConference’ for real-time sharing of applications, ‘Annotator’ for on-line creation and
management of minutes and ‘WebProM’ to provide a common data management
environment within a distributed team. A key requirement for the TEAM system was that, as
far as possible, the demonstrator should run over heterogeneous platforms, since a wide
variety of platforms exist along supply chains.

Results

Summary of TEAM use


During the narrowband sessions, the average session length was 35 minutes (range 20–50
minutes), with sessions fairly focused on specific design issues. One engineer at each site
generally participated, with up to 2 further additional participants for some sessions. The
TEAM sessions generally took the following form: pre-preparation of an agenda by the party
calling the conference (using Annotator) and storage of this agenda using WebProM; joint
viewing and discussion of agenda using the shared whiteboard; importing of CAD images
into the whiteboard (this was done as preparation in some cases); discussion of design issues
and annotation of imported CAD images; saving of whiteboard pages as minutes.
The most useful applications were the shared whiteboard, WebProM, Annotator and the
audio tool. There was also limited use of the video for showing physical components: images
were captured and then imported into the whiteboard. Face-to-face video was not seen as
useful for engineering discussions. For the types of design activities undertaken during the
trials, application sharing of CAD packages was also requested; however, where this had been
tested over 6 Channel ISDN, the performance and reliability had been poor with complex
CAD models.
TEAM sessions were characterised as being very successful in achieving the aims of the
sessions, but from the users’ point of view, only partly satisfactory. Where the aims of the
sessions were not achieved, the main reasons were the lack of availability of key people and
the need to consult specific data which had not been anticipated, leading to an inability to reach
concrete decisions. The main reasons that users felt only partly satisfied with TEAM were the
poor usability (both during start-up and in use), the lack of robustness and problems due to
the non-integration of the TEAM system with company networks.

Impact on the PIP


TEAM was judged a substantial improvement over current PIP working practices, although
the actual benefits were slightly negated by poor quality audio, a lack of robustness and ease
of use, and a need for integration with other company systems. The overcoming of these
limitations would produce a major improvement in current collaborative engineering working
practices. The projected time savings for different stages of the PIP using TEAM, as predicted by the participating engineers, are shown below in Table 1.

Table 1. Projected time savings using TEAM, for stages of the PIP

The prerequisites for these savings and a more detailed cost-benefit analysis are given in
(May, Carter et al., 1997) and (Distler, Carpenter et al., 1997).
A major stated benefit was the ability for engineers and designers to exchange ideas more
easily and hold more effective and detailed collaborative technical discussions than is
possible over the phone, with less possibility of misunderstandings and ambiguities. It was
still felt that face-to-face meetings (around a terminal if necessary) enabled the easiest
technical discussions, but were often inefficient due to the large amount of travel involved.
For the impact on product quality, the view was that with TEAM, a similar final product
quality would be achieved, but this would be achieved earlier, as there would be more ‘right
first time design’. Also it was felt that many changes during the build phase would be
eliminated, leading to a reduction in time to volume and in the prototype phase. More detail
could be built into the prototypes allowing detail to be finalised earlier on in the PIP.

Impact on roles and the company


From an individual’s perspective, TEAM should reduce the time spent out of the office and
increase the effectiveness and efficiency of an engineer. It is likely that engineers will feel
they have more control over their jobs as they will be able to react more quickly and
effectively to design issues as they arise. Engineers stated that they did not want the
introduction of technology such as TEAM to greatly affect current roles within companies;
rather it was seen as offering tools to help facilitate the way they work. However, if widely
implemented, it is likely that TEAM will initiate some changes to company roles and the
organisation.
CAD engineers may need to become more involved in project management, and involved
earlier in the PIP, and there will be an emphasis on using information more fully throughout the
PIP. Some activities such as quoting on tenders will happen earlier. Responsibilities and authorities
of individuals may need to be redefined, as workstation-based engineers will have powerful
communication tools, but not necessarily the authority to make decisions using them.
Companies and supply chains should feel more integrated, as communication tools reduce
the isolation felt by sites which are geographically remote. Although it is likely that TEAM
would reduce the frequency of traditional face-to-face meetings between customers and
suppliers, it is seen as actually building stronger relationships between companies due to the
more interactive nature of discussions, and more frequent communications.

Conclusions
The user companies are pursuing implementation strategies for the technology demonstrated
in TEAM. Careful planning was recognised as essential, as with any system operating within
a complex environment.
A recommendation was made that it should be introduced at the beginning of a new
vehicle programme so that appropriate systems, processes and working practices can be
developed, and the problems of legacy data minimised. Initial emphasis should be placed on
simple tools that improve the communication between customers and suppliers, before more
complex functionality is offered. Before successful implementation, the following additional
issues would need resolving:
• Agreement on appropriate IT strategy between distributed team members.
• Resolution of costs, especially for smaller companies lower down the supply chain.
• Continuing awareness building amongst end-users and top management.
• The maintenance of company security for commercially sensitive project data.
• Integration of TEAM into company networks, for interoperability and data access.
• Major improvements in the reliability and ease of use.
• The further development by IT providers of collaborative tools that work satisfactorily
over narrowband networks.

IT&T demonstrated in TEAM offers distinct improvements over some current engineering
working practices. Benefits include time savings across all stages of the PIP, easier and more
effective discussions between distributed project teams, less possibility of misunderstandings
or ambiguities and an ability to react more quickly.

Acknowledgements
This paper was based on work carried out by the TEAM (AC070) (Team-based European
Automotive Manufacture) project, funded by the CEC under the ACTS (Advanced
Communications Technologies & Services) programme DGXIII. The authors wish to
acknowledge all of the project partners who contributed to the success of the project.

References
Distler, K., Carpenter, P., Caruso, P., D’Andrea, V., Doran, C., Fontana, P., Foster, P., May,
A., McAllister, W., Pascarella, P. and Savage, R. 1997, TEAM (AC070) Deliverable
DRR016: Cost-Benefit and Impact Analysis. München, Siemens AG.
Lamming, R. 1994, A review of the relationships between vehicle manufacturers and
suppliers, DTI/SMMT Report.
Lawson, M. and Karandikar, H.M. 1994, A survey of concurrent engineering, Concurrent Engineering: Research and Applications, 2(1), 1–6.
Londono, F., Cleetus, K.J., Nichols, D.M., Iyer, S., Karandikar, H.M., Reddy, S.M., Potnis,
S.M., Massey, B., Reddy, A. and Ganti, V. 1992, Coordinating a Virtual Team. West
Virginia, CERC–TR–RN–92–005, Concurrent Engineering Research Centre, West
Virginia University.
May, A., Carter, C., Joyner, S., McAllister, W., Meftah, A., Perrot, P., Pascarella, P.,
Chodura, H., Doblies, M., Carpenter, P., Caruso, P., Doran, C., D’Andrea, V., Foster,
P., Pennington, J., Sleeman, B. and Savage, R. October 1997, TEAM (AC070)
Deliverable DRP013: Final Results of Demonstrator Evaluation. Loughborough,
HUSAT Research Institute.
Simpson, G. 1996, Components of success—or failure—for the 21st century. Society of
Motor Manufacturers and Traders Conference, ‘Driving tomorrow’s world’,
Birmingham, UK.
WORK DESIGN
THE EFFECT OF COMMUNICATION PROCESSES UPON
WORKERS AND JOB EFFICIENCY

Anne Dickens and Chris Baber

Industrial Ergonomics Group


School of Manufacturing and Mechanical Engineering
University of Birmingham
Edgbaston
Birmingham B15 2TT

This paper examines the way that communication structures affect workers
and job efficiency. This paper focuses on a company using manufacturing
cells where support is based away from the shopfloor. It was found that remotely
based support led to sequential communication processes, which in turn
meant long lead-times. Interviews with staff showed that being locked into a
highly sequential and rigid system led to low levels of job satisfaction and
frustration due to an inability to change the system for the better. It was
clearly shown that there was a strong link between organisational design,
communication processes and job satisfaction.

Introduction
Previous research has shown that the majority of companies implementing manufacturing
cells do not make changes to their organisational structure to embrace these flexible production methods (Dickens and Baber, 1997a). This paper determines the implications that this has for the effectiveness of communication processes. Also, the effect that these communication processes have upon worker satisfaction and job efficiency is examined.
The following paragraphs describe the terms used in this paper.
The organisational structure of a factory consists of two main systems: the manufacturing
system and the support system. The type of manufacturing system in this study is cellular, i.e.
the system is divided into smaller, autonomous manufacturing units.
Every manufacturing cell requires assistance, and the support system provides this. Due to
the nature of the study, no measurements are taken of cell communication structures, only of
support system structures. A support system consists of up to thirteen support functions, each
of which perform specific tasks. These functions are: information technology, maintenance,
stores control, engineering, logistics, quality, design and development, human resource
management, procurement, sales, marketing, production planning and finance (Dickens and
Baber, 1997b). The results from a postal survey (Dickens and Baber, 1997a) show that the
majority of support functions are centralised, i.e. not based upon the shopfloor, but in remote
offices. Similarly, support is functionally divided so that people performing the same or
similar tasks are based together. This leads to a sequential approach to work. Bessant (1991)
claims that flexible models of production are incompatible with older forms of organisation,
especially those stressing division of labour and rigid bureaucratic forms. This study
investigates Bessant’s claims by examining the effect of support system configuration upon
communication processes.
The company selected for the case study was representative in that it fulfilled all the
configurational traits described in the paragraph above. That is, the company had two years’
operational experience of manufacturing cells, and centrally based support functions with
functional division of work. The site had 800 employees and an annual turnover of
approximately £60 million.

Methodology
Scenario measurement was selected as the foundation of the case study. This means selecting
a common factory procedure and charting its communication structure. A scenario called
‘scheduling’ was selected due to its criticality. If the scheduling process fails, the
manufacturing cell stops. Scheduling extends from the forecast of demand, to the delivery of
the parts to the cell. The study examined the scheduling of one sub-assembly for one cell. The
sub-assembly was typical in that it had to be ordered from an external supplier (outsourced),
and required additional press-shop operations before delivery to the cell. The cell in question
is representative in that it manufactures 29 varieties of one product and employs 15 full-time
operatives and a cell manager.
To measure the scenario, outline flow process charts were used (International Labour
Office, 1979) because of their flexibility. This method not only allows the communication
structure to be quantified, but allows us to make inferences regarding job efficiency. Job
efficiency for a support system is summarised using the following measures: lead-time, on-
time deliveries, and number of non-value-adding stages within a process. A preliminary
examination of the scheduling process discovered that two charts would be necessary: one
documenting the physical stages of the scheduling process; and another monitoring flow
through the Manufacturing Resource Planning (MRPII) system. Both systems operate in
conjunction with one another to complete the scheduling scenario.
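
As an illustration of how such charted stages yield the job efficiency measures, the Python sketch below derives lead-time, the proportion of non-value-adding stages and an on-time delivery rate from a small, entirely hypothetical list of stages. It indicates the arithmetic involved only; it is not the ILO charting notation and it does not use the company’s data.

from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    kind: str     # 'operation', 'transport', 'delay', 'inspection' or 'storage'
    days: float

# Hypothetical extract from an outline flow process chart for 'scheduling'.
stages = [
    Stage("forecast demand", "operation", 2.0),
    Stage("raise purchase order", "operation", 1.0),
    Stage("await supplier confirmation", "delay", 5.0),
    Stage("goods-inwards inspection", "inspection", 0.5),
    Stage("transfer to press shop", "transport", 1.0),
    Stage("queue at press shop", "delay", 4.0),
    Stage("press operations", "operation", 1.0),
    Stage("deliver to cell", "transport", 0.5),
]

lead_time = sum(s.days for s in stages)
# Only operations are treated as value-adding in this simple illustration.
non_value_adding = [s for s in stages if s.kind != "operation"]
print(lead_time)                                     # 15.0 days
print(len(non_value_adding), "of", len(stages))      # 5 of 8 stages add no value

# On-time delivery rate over a set of monitored deliveries (hypothetical counts).
on_time, total = 1, 31
print(round(100 * on_time / total, 2), "% on time")  # 3.23 % on time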

Results

The Communication Structure


Each box in Figure 1 represents a single stage in the communication structure and each arrow
indicates the direction in which the sequence proceeds. The unmarked arrows show that the
process stage moves ahead unhindered. Arrows labelled ‘not OK’ show the path of the
process if a problem occurs. Because the MRPII system and communication structure operate
concurrently, certain facets overlap. As a result, the grey boxes in Figure 1 depict stages
where MRPII is accessed in some way. The bold type in Figure 1 depicts departments where
tasks are carried out.

Figure 1. Communication structure for scheduling scenario

The MRPII System Structure

Figure 2. Flow of information through MRPII system



In Figure 2, the lines on the diagram show information being either inputted into the MRPII
system, or requested from it. The boxes represent parties who process this information in
some way.

Summary of Job Efficiency


Only 3.23% of the components involved in the scheduling scenario arrived at the cell on-time.
The following table summarises the job efficiency measures.

Table 1. Summary of Job Efficiency Measures

* The sum of the two is still one month because they operate simultaneously

Discussion
The results show that the scheduling process has become part of the critical path in terms of
manufacturing lead-time. This means that the overall time taken to complete a product can be
either lengthened or shortened by the scheduling process, and the manufacturing cell has no
control over this. This is demonstrated by the fact that 96.77% of parts are delivered late. This
is caused by inefficiencies within the communication structure, which are discussed further
below.

Support System Configuration


Due to the fact that all support functions within the communication process are functionally
based, a sequential approach to work has evolved; each function performs its own task before
passing it on to another function. This means that communication takes longer, hence contributing
to long lead-times. A sequential approach similarly leads to individual support functions having
little understanding about the tasks performed by other functions. This manifests itself negatively
in that individual functions become self-focused rather than team-focused.
When interviewed, the staff felt that the system design was poor. They believed that the
sequential nature of the system meant they were trapped in a rigid process, and often blamed
when parts were late. The latter was true even when it was the process rather than the
individual which caused the delay. For example, when the manufacturing cell did not receive
their parts on time, energy was spent apportioning blame rather than solving the fundamental
problem of a poor communication structure. Staff also felt that their suggestions to improve
the process were ignored by management.
An alternative to functionally based support would be the use of multi-functional groups.
This would allow the use of verbal, face-to-face communication, which is a more time-effective method than paper systems.

The Communication Structure


The complexity of the communication structure is demonstrated by the number of stages in
the scheduling process. When examining the scheduling process in conjunction with MRPII,
there are 25 steps and 12 departments in total. The communication system evolved as the
company expanded and this lack of formal design is borne out by the complex, rigid network
of links and loops. System re-design could eliminate up to 70% of these stages as non-value-
adding.

The Use of MRPII


All staff interviewed similarly distrusted the information contained within the MRPII system.
Data was incorrectly entered into the system, resulting in inaccurate outputs. In addition, data
was mistrusted because of two-way information flows (see Figure 2), involving forecasting
(pushing information into the system) and ‘backflushing’ (pushing information back through
the system once the product has been completed in order to adjust stock levels). This results
in system ‘lags’, making it difficult to judge the accuracy of information at any given time. It
also means that information is constantly changing, which makes decision-making difficult
and inaccurate for functions such as purchasing. These lags also negate the benefits of
interfacing outsourced companies with the system.
It is questionable whether MRPII should be used at all within the company. MRPII is a
rigid system primarily designed to control ‘push’ systems, and as such, is often incompatible
with flexible manufacturing systems such as cells. It is likely that a Kanban system would
simplify the scheduling process.
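
To make the push/pull distinction concrete, the Python sketch below contrasts an MRP-style push release, driven by a forecast schedule, with a kanban pull in which replenishment is authorised only by cards returned from emptied containers. It is a deliberately simplified model and is not based on the company’s actual systems.

from collections import deque

class PushScheduler:
    # MRP-style push: parts are released according to a forecast schedule,
    # regardless of what the cell has actually consumed.
    def __init__(self, forecast):
        self.forecast = deque(forecast)   # planned release quantities per period

    def release(self):
        return self.forecast.popleft() if self.forecast else 0

class KanbanLoop:
    # Pull: an emptied container returns its card, and only a returned card
    # authorises replenishment of one container of parts.
    def __init__(self, containers, container_qty):
        self.full_containers = containers
        self.container_qty = container_qty
        self.free_cards = 0

    def consume_container(self):
        if self.full_containers:
            self.full_containers -= 1
            self.free_cards += 1          # card travels back upstream

    def replenish(self):
        made = self.free_cards * self.container_qty
        self.full_containers += self.free_cards
        self.free_cards = 0
        return made                       # production is capped by actual demand

push = PushScheduler(forecast=[120, 120, 120])
pull = KanbanLoop(containers=3, container_qty=40)
pull.consume_container()                  # the cell used one container this period
print(push.release())                     # 120 parts released, whatever was used
print(pull.replenish())                   # 40 parts made, matching consumption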

Conclusions
The communication structure led to long lead-times, such that the scheduling process became
part of the critical path for the manufacturing process, and was therefore capable of limiting
throughput. The flow chart results show that the flow of information and components is
overly complex and not conducive to a typical cellular manufacturing pull system, resulting
in the need for re-design.

References
Bessant, J., 1991, Managing Advanced Manufacturing Technology; The Challenge of the
Fifth Wave, (NCC Blackwell, Manchester)
Dickens, A., Baber, C. and Quick, N., 1997a, Support system configurations: design
considerations for the factory of the future. In S.A.Robertson (ed.) Contemporary
Ergonomics 1997, (Taylor and Francis, London), 510–515
Dickens, A. and Baber, C., 1997b, Distributed Support Teams in Manufacturing
Environments. In S.Procter and F.Mueller (eds.) Teamworking, (University of Nottingham Print, Nottingham), 75–91
International Labour Office, 1979, Introduction to Work Study Third Edition, (ILO
Publications, Geneva)
A CASE STUDY OF JOB DESIGN IN A STEEL PLANT.

H.Neary and M.A.Sinclair

Department of Human Sciences


Loughborough University
Loughborough, Leics. LE11 3TU

The paper outlines the design of a critical new job in a rolling mill.
Comments were requested on the project of which this case study formed a
part, and are given at the end of the paper.

Introduction
The study took place in a large privately-owned Company, said to be one of the most efficient
in the global steel industry. The plant for this study (the ‘Beam Mill’) is part of a large
complex. Slabs of cold steel are reheated and rolled into I-shaped beams. These are large,
structural beams, of the sort used for road bridges and oil platforms. Rolling mills typically
are long, linear processes, working on batch sizes of one, with three batches on the process
line and several in the reheating furnace, being prepared to go on the line. Processing is 24
hours, 7 days a week.
The Company has been carrying out incremental changes to its processes for many years.
Most of these changes have been aimed at improving the quality of the product by
introducing automation and new machinery; other changes have been aimed at reducing
labour costs and increasing efficiency. This particular study had several objectives: improve
the productivity of the plant (by increasing throughput), improve quality of the product (by
reducing dimensional variability), and improve the flexibility and agility of the plant in its
response to the marketplace (by improved layout of equipment, more integration of
automation, and better utilisation of storage areas), and by widening the product range from
this plant. The focus of attention was the ‘hot saws’ area of the plant. In this area, beams
arrive which have been rolled to the correct cross-sectional shape from the original steel
slabs, and are still very hot. The beams are cut to size, to fulfil customer orders, and go to the
Banks, until being called off for delivery to the customer.
In May, 1995, the Company began the ‘hot saws’ refurbishment with a budget of
approximately £16 million, with the intention of completing the project by 1 September 1997.
The development was costly and time consuming, involving several suppliers and numerous
individuals from a wide range of backgrounds. The project involved reshaping of the process
line; installing two in-line higher-capacity saws in place of the two old parallel saw tracks;
replacing the IT applications and integrating them more into the whole production control system;
refurbishing the control room; and reducing the manning levels by a ratio of 4:1.
The hot saws development has had its problems and setbacks; for example, there were several
serious delays with the software design. The software supplier, a sub-contractor to the saws
supplier, had had no experience of the steel industry beforehand, and in May 1997 was about
ten months behind schedule in producing the ‘final solution’ for 1 September 1997. However,
the installation of the equipment was to go ahead on schedule, without the new control system,
with the new saws commissioned at the end of August 1997. Therefore, it was necessary to
devise an ‘interim solution’, based on the current control system. In April 1997 the interim
solution was begun within the Company, involving several IT and Engineering Departments.
The solution was constrained due to the tight time scale, and only limited development was
possible. It would incorporate minimal changes to the current mill software system, yet use the
new hardware. The consequence of this was that the ‘interim solution’ would require its own
job description and training needs, different to those currently in place and different to those
anticipated for the ‘final solution’, but this problem was temporary and could be solved during
the commissioning period. Both solutions were developed concurrently, leading to uncertainty
among the stakeholders regarding the actual operation of the new system.
Training plans were being compiled by the Training Department in the Company. However,
the main supplier had not yet provided the appropriate documentation and training in relation to
the hardware. Therefore, the adequacy and timing of training was expected to lead to problems
for the workers trying to learn about the new system and their role within it before going ‘live’ on 1 September 1997. These problems would be exacerbated because the ‘interim solution’ had not
had sufficient time in its development to have been properly user-tested.
The Case Study was carried out over the summer of 1997, concentrating on the redesign
of the Saw Controller’s job in the hot saws Control Room, dealing only with the ‘final
solution’.

Problem definition for the Case Study


The goals set for the Case Study were as follows.

• A description of the new Saw Controller’s job (i.e. the ‘final solution’).
• Knowledge requirements for the job, as a basis for a training needs assessment.
• As a subsidiary task, comment on the approach adopted in the project, as a contribution to
self-awareness and organisational learning within the Company.

Execution of the Case Study


A ‘User-centred’ approach was adopted. Stakeholders (i.e. people directly affected by any
changes) in the hot saws development were initially identified by discussion with the project
managers for the development, and interviews started in May. ‘Snowballing’ was used to expand
the initial, basic set of stakeholders. This method is known to have problems when applied to a
general population, but in a structured organisation it was deemed satisfactory. The first interview
was a pilot interview, involving the Unit Trainer from the hot saws, to give a better insight into
the roles of operators at the hot saws and to develop a set of questions that could be posed to the
saw operators. The interviews included task analysis of the existing job and what the operators
expected to change in the new development. Their personal opinions about the full situation
were also sought. The interviews were conducted on a one-to-one basis. Some problems were
experienced with shopfloor operators at this stage, for resistance-to-change reasons.
On average an interview lasted between 30 and 40 minutes. In total, including
stakeholders in management positions, 35 interviews were conducted.
Management interviews were more exploratory and individualised than those with shopfloor operators. This was because each manager had a different background and different
input into the hot saws development project.
As well as conducting interviews in the Beam Mill, visits to other rolling mills both on
and off site were arranged. These visits provided further insight into the automation to be
introduced and operators’ opinions regarding the automation, its effects on their jobs, and so
on. An advantage of the incremental improvements approach is that there are always
analogies to the current improvement somewhere fairly close, which can act as guides for the
current development.
Observational studies were also conducted at the hot saws to establish the sequence of
events, time cycles, and so on, to enable a simulation of the workloads for the Saw Controller
under normal and worst-case scenarios to be undertaken.
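A simple way to explore such workload estimates is to compare the attended task time generated per beam with the time available in the rolling cycle. The following sketch, in Python, is a minimal illustration of that kind of calculation only; the cycle times, task durations and scenario values are hypothetical and are not taken from the study.

    # Minimal workload sketch for a single operator; all timings are assumed values.
    # Utilisation = attended task time required per beam / beam inter-arrival time.
    def utilisation(cuts_per_beam, secs_per_cut, monitoring_secs, beam_interval_secs):
        """Fraction of the cycle the operator is occupied (values > 1.0 imply overload)."""
        demand = cuts_per_beam * secs_per_cut + monitoring_secs
        return demand / beam_interval_secs

    # Hypothetical scenarios: automatic cutting of long beams vs. full manual
    # control of short, small-section beams (many more cuts per beam).
    scenarios = {
        "normal (automatic, long beams)": utilisation(4, 10, 30, 180),
        "worst case (manual, short beams)": utilisation(20, 25, 60, 180),
    }
    for name, value in scenarios.items():
        print(f"{name}: utilisation = {value:.2f}")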
On the basis of findings from these different approaches, a new job description was
developed. This involved three lengthy interviews with relevant members of management, followed by a number of workshops, with the following aims:

• to incorporate the expertise of those knowledgeable about sawing operations


• to ensure that the new job description was realistic
• to engender a degree of ownership of the final description by those likely to undertake the job.

The interviews with management established the ‘official’ job description. The first
workshop involved the unit trainer from the hot saws. The workshop’s aims were to develop a
clear statement (in flowchart form) of the tasks involved in the new operation of the hot saws,
in full automation mode and in manual mode, as well as a list of critical incidents that might
occur. For the subsequent workshops, two saw controllers, as well as the hot saws unit trainer,
provided knowledge requirements for the operation of the hot saws in the final solution, using
a new technique described in another paper in this conference by Siemieniuch et al. The
participants in these workshops were recommended by management as being senior
personnel with a good understanding of the likely Saw Controller’s job for the final solution.
A knowledge tree was devised from these workshops, which was then utilised to develop the
knowledge and skills required to perform the Saw Controller’s job.

New job description for the Saw Controller


Outline of the old working context and job
Before September 1997, there were four operators per shift in the saws control room: a Saw Controller and a Saw Driver for each track. On the shopfloor, there was one Section
Controller, one Rolling Utility Man, and two Saw Utility Men. Each of the tracks had one
saw blade located at a fixed point on the track. The control room is located to the side of the
tracks and the operators have direct vision of the hot saws.
The Saw Controller decided the sequence in which the customers’ orders were cut. Bars
were identified by ‘mill codes’, used to assist the tracking and processing of bars through the
mill. The Saw Controller had a number of objectives to follow when deciding this:

• to cut priority orders first,


• to limit the number of orders open at any one time (usually 3 orders per saw),
• to ensure that the steel is being cut at the optimal rate,
• to ensure that there is maximum yield achieved from each beam.

This method was the manual mode, in which the Saw Controller manually entered into the computer both the order and item numbers, obtained from the ‘saw sheets’ (shift schedules
based on customer orders). It was under the Saw Controller’s instruction that the Saw Driver
cut the steel.
There was a second method of operation called the ‘optimisation method’. A software
algorithm decided the cutting pattern, monitored by the Saw Controller. The Saw Driver’s job did not change in the optimisation method. In theory, the Saw Controller had only to
monitor the technology and deal with unexpected situations.
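The optimisation algorithm itself is not described in the paper; as a purely illustrative stand-in, the Python sketch below chooses which order items to cut from one beam so that yield is maximised, subject to the limit on open orders mentioned above. The beam and item lengths are invented.

    # Hypothetical greedy sketch of a cutting-pattern choice: maximise yield
    # (cut length / beam length) subject to a limit on open items per saw.
    from itertools import combinations

    def best_cutting_pattern(beam_length_m, item_lengths_m, max_open_items=3):
        best, best_yield = (), 0.0
        for r in range(1, max_open_items + 1):
            for combo in combinations(item_lengths_m, r):
                used = sum(combo)
                if used <= beam_length_m and used / beam_length_m > best_yield:
                    best, best_yield = combo, used / beam_length_m
        return best, best_yield

    # Invented beam and order-item lengths (metres).
    print(best_cutting_pattern(24.0, [6.1, 9.3, 12.2, 7.5, 15.0]))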

Description of the new working context and the proposed job


In the new situation the Saw Driver role has disappeared, and there is only one Utility role in the
saws area. The new hot saws operation will be almost fully automated. There will be a single
saw track through which all bars will travel. There will be two saws, one fixed and one movable.
Beams will be cut in accordance with ‘load plans’ (customer orders recategorised into loads
leaving the plant) and ‘mill codes’ will be eliminated. This will change a lot of the processes
within the system. The manning level for the saws control room will be reduced from four
operators to one. The role of the operator will be to monitor the sawing process and the operational
technology.
However, it was calculated that under ‘worst case’ conditions, a single operator could not
cope. The ‘worst case’ scenario is one in which the load plan requires short beams of minimal cross-section and the process has to be operated in full manual mode (i.e. the order of cutting has to be calculated and then manually controlled). This would be too demanding for one individual, and a management decision would be needed either to operate at a higher manning level or to stop the mill.
The job will comprise the following tasks:

• Initial preparation and re-start after maintenance. The operator runs a ‘Virtual Bar’
past the hot saws.
• Blade Changes and Beam Section Changes. The Saw Controller ensures that the whole
mill has slowed down and that all technicians have been informed, and then monitors
the change.
• Normal Rolling Duties. The Saw Controller has only a passive, monitoring function to
perform.
• Detection and Prevention of Problems/Reaction to Alarms. The Saw Controller will
use numerous computer interfaces to monitor equipment and to prevent time delays
and critical incidents by early detection of problems. This requires communicating
with other parts of the mill to maintain ‘situation awareness’.
• Manual intervention to carry out a standard task. Several different levels of manual
intervention are possible with the new automation, including full manual operation. It will
also be possible to interrupt the automation, mid cycle, to cut a test piece, etc. The system
will then be re-instated, and will re-optimise orders for the remaining runout length.
• Carry out Secondary Tasks. These are:
- General housekeeping tasks in the control room. The operator is also expected to
clean, paint and de-scale equipment on the mill.
- Assist the mechanical and electrical engineers during maintenance tasks, by
operating the equipment on request of the engineers.
- Paperwork. The operator will log any incidents, subsequent events and actions
taken, as part of good management practice. Manual proforma sheets will be
provided. Start up check sheets will also be provided.

Training and selection issues


The above tasks, defined as the Saw Controller’s job, can be completed in the available time when the hot saws equipment and software are fully operational, and given careful
selection and training of the operators involved. However, full training plans were not
available during the case study, so no comment is possible. Nevertheless, a significant change
in the job requirements and in the working environment will occur, and preparation of the
workforce is a critical issue.
It should be noted that, from a lifecycle perspective, the operators in a plant may be seen as
the “designers’ on-site representatives”, responsible for realising the cost-effectiveness of the
plant as planned by the designers, and developing more efficient ways to operate the plant
(Rasmussen and Goodstein 1986). Hence, it is necessary to ensure that the operators know not only what to do but also why they are doing it, i.e. that they have some insight into the design of the plant and its
behaviour. This demands a different approach to the planning of training, compared to the usual
approach.
Another issue, not known to be addressed in the plans but of concern to the workforce, is that of promotion routes associated with this new job. It is not clear how people can progress into
this job, become more skilled, achieve promotion and advance elsewhere in the Company.
Uncertainty about this issue is known to be a source of resistance to change; furthermore, in an
environment in which downsizing is an established long-term trend, those who retain their jobs but see no recognisable future may regard themselves as next in line, and may therefore be no better motivated than those who know their jobs will cease to exist.
Finally, given the potential for boredom inherent in this new job, there is an important
issue in maintaining situation awareness and the sharpness of problem-solving skills. It was
recommended that issues of retraining and the simulation of problems should be addressed.

General comments on the project as a whole


The authors were asked to make any additional comments about the project thought to be
relevant. These are summarised below.

• It is well-known that a task in which operators are required to monitor a process for
extended periods of time with little involvement will produce wandering of attention,
lassitude, and a loss of situation awareness. The scope of this project has not permitted
coverage of this issue, but it is one for management to consider, given the significant
role of the Saw Controller in the whole process.
• Many of the tasks in the ‘final solution’ will be novel for the operators, requiring new
skills or higher levels of expertise in existing skills. Training plans will require
considerable attention if a trouble-free transition is to occur.
• Given a highly-automated process, it is known that operators can become deskilled in
fixing problems. Operators in an adjacent plant are aware of this in their own jobs.
Simulation facilities may be an answer, for operators both to learn and to maintain
their operational and problem-solving skills. Current training plans do not envisage
such facilities, either for training or under operational conditions.
• There may be problems with skills progression and promotions surrounding this
new job.
• Some avoidable problems that occurred during the implementation of the new process
in this project seem to have been due to poor communications. This is a major
issue, and refers to communications between operators and designers; between
management personnel to avoid ‘over-the-wall’ problems; and between the various
organisations involved.
• It appears that the new ways of working will mean changed priorities for jobs within
the Beam Mill. The Saw Controller now moves to centre stage in determining what
happens in the Mill. There will inevitably be knock-on effects for other jobs in the
Mill, which will need further consideration.
• The overall design philosophy seems technocentric, when current thinking is moving
towards a more socio-technical philosophy for engineering design. This shift is
perhaps best expressed as moving from ‘what technology is needed to solve this
problem’ to ‘what’s the right operational combination of people and technology to
meet business objectives?’ There seems to be a case for the Company to move in this
direction as well. The development of this approach will require time before it reaches
the same level of process maturity as the current method, but it should obviate the need
for the oft-repeated mantra of the project, “Whatever it takes, we will deliver the
goals”. Given the number of times process improvements have been carried out in the
Company, this mantra should not be necessary; it is indicative of the need for a
different approach.

References
Rasmussen, J. and L.P.Goodstein (1986). Decision support in supervisory control. Analysis,
design and evaluation of man-machine systems, Varese, Italy, 2nd IFAC/IFIP/IFORS/
IEA Conference.
THE EFFECTS OF AGE AND HABITUAL
PHYSICAL ACTIVITY ON THE ADJUSTMENT TO
NOCTURNAL SHIFTWORK

T.Reilly, A.Coldwells, G.Atkinson and J.Waterhouse

Research Institute for Sport and Exercise Sciences


Liverpool John Moores University
Mountford Building, Byrom Street
Liverpool, L3 3AF

The purpose of this research was to determine the influences of age and
habitual physical activity levels on the adjustment to and tolerance of
nocturnal shiftwork. Participants included young (mean age 23.4 years) and
older (mean age 48.9 years) male shiftworkers who operated on a slow-
rotating and backward-rotating shift. Circadian rhythm characteristics were
determined for 5 days on each of 3 work-shifts (night, afternoon, morning).
Leisure time activity was quantified by means of a questionnaire. Younger
subjects had higher amplitudes in their circadian rhythms and adapted more
quickly. Active subjects showed a similar trend in rhythm amplitude and
adjusted more quickly to nightwork. The older subjects were better suited to
the morning shift than their younger counterparts. The observations support
a scheduling scheme which takes age (but not necessarily habitual activity
level) into account.

Introduction
An appreciable proportion of the national workforce is engaged periodically in shiftwork.
Shiftwork refers to any regularly taken employment outside the day-working window,
defined arbitrarily as the hours between 07:00 and 18:00 hours (Monk and Folkard, 1992). It
has been estimated that 10–25% of individuals in employment participate in shiftwork, with
higher rates among manual workers in Britain and among part-time workers in America
(Young, 1982; Reilly et al., 1997). About 18% of European workers are thought to engage in
nocturnal shiftwork for at least one quarter of their working time and in the USA 20 million
full-time employees are involved in shiftwork (Costa, 1997). In view of the scale of nocturnal
shiftwork use in industry, the consequences of shiftwork systems on human factors issues are
worthy of investigation.
Shiftwork causes disturbances of the normal sleep-wake cycle and circadian rhythm.
Ageing is thought to be associated with a reduced ability to tolerate circadian phase shifts
such as occurs when starting on a nocturnal work-shift. There is concern also that ageing
workers have more health-related problems than younger colleagues when the human body
clock which regulates circadian rhythms is disrupted (Waterhouse et al., 1992). Such
disruption occurs after travelling across multiple time zones or engaging in nocturnal work-
shifts.
Habitual physical activity may act as a time-signal for the body clock and so influence
circadian rhythm characteristics (Redlin and Mrosovsky, 1997). Atkinson et al. (1993)
reported higher amplitudes in circadian rhythms of physically fit subjects compared to a
group of inactive individuals studied under nychthemeral conditions. It has been suggested
that this higher amplitude is characteristic of a tolerance to circadian phase shifts, such as
occurs in adjusting to night work or following long-haul flights (Harma, 1995). There is no
agreement regarding the causality of this relationship.
For the ergonomist, these considerations are relevant to the design of work schedules best
suited to ageing employees. There are consequences also for the selection of individuals
according to tolerance of night work. Therefore, the purpose of this study was to determine
the influences of age and physical activity on the adjustment to and tolerance of a nocturnal
shiftwork regimen.

Methods
Twenty male shiftworkers, drawn from a car manufacturer and from police and security work, were recruited for the study. They were divided into a young (n=9; mean age±SD=23.4±2.3 years) and an
old (n=11; mean age 48.9±5.2 years) group. All subjects worked a slow-rotating and
backward-rotating shift system, i.e. night, afternoon, morning.
Subjects were also subdivided into active and inactive sub-groups based on reports in the
leisure-time Physical Activity Questionnaire (Lamb and Brodie, 1991). The physical activity
status of the active group was calculated as 55.9±10.5 units, compared with 4.7±2.6 units for the inactive group, based on energy expenditure over a 14-day period.
Observations were made over the solar day whilst on the various shifts as outlined in
Table 1. Subjects recorded their own oral temperature, measured by means of a digital clinical thermometer (Philips, Eindhoven). Grip strengths for the right and left hands were measured
using a hand-held spring-loaded dynamometer (Takei-kiki Kogyo, Tokyo). Peak expiratory
flow was measured with a flow-meter (Airmed, London). Arousal was self-rated using a
visual-analogue scale. Measurements were made where feasible every 2 h for the 5 days on
each of the 3 work-shifts.

Table 1. Times at which measures were recorded by subjects on each shift

Rhythm characteristics were determined by means of cosinor analysis (Nelson et al., 1979). Comparisons between groups were made using analysis of variance.
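Single-component cosinor analysis fits a 24-hour cosine to the repeated measures, yielding a mesor (rhythm-adjusted mean), an amplitude and an acrophase (time of the fitted peak). The Python sketch below shows one common way such a fit can be obtained by linear least squares; it is an illustration only, not the analysis code used in the study, and the sample data are invented.

    # Illustrative single-component cosinor fit (24 h period) by linear least squares.
    # y(t) = M + A*cos(2*pi*(t - phi)/24), rewritten as M + b*cos(wt) + c*sin(wt).
    import numpy as np

    def cosinor(times_h, values, period_h=24.0):
        w = 2 * np.pi / period_h
        X = np.column_stack([np.ones_like(times_h),
                             np.cos(w * times_h),
                             np.sin(w * times_h)])
        mesor, b, c = np.linalg.lstsq(X, values, rcond=None)[0]
        amplitude = np.hypot(b, c)
        acrophase_h = (np.arctan2(c, b) / w) % period_h   # clock time of the fitted peak
        return mesor, amplitude, acrophase_h

    # Invented oral temperature readings taken every 2 h over one day.
    t = np.arange(0.0, 24.0, 2.0)
    y = 36.8 + 0.4 * np.cos(2 * np.pi * (t - 17.0) / 24.0)
    print(cosinor(t, y))   # approximately (36.8, 0.4, 17.0)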
The participants also completed the Standard Shiftwork Index at the beginning of the
study (Barton et al., 1990). The index contains six sections on general biographical
information, sleep and fatigue, health and well-being, social and domestic situation, coping,
and chronotype (morningness/eveningness). The detailed observations gained using the
inventory are not included in this report.

Results
Overall, the younger subjects were found to have higher amplitudes in their circadian
rhythms and faster adaptations of the rhythms to nightwork. The older subjects seemed better
suited to the morning shift than their younger counterparts. This corresponded to an increased
tendency to ‘morningness’ in the older group. Those subjects with a high level of leisure time
activity possessed larger circadian rhythm amplitudes than inactive individuals but not faster
adaptations of the rhythms to nightwork.

Night shift
There was a significant time of day effect for both old and young subjects in oral temperature.
For the first night worked, the age by time of day interaction was significant for oral
temperature (P<0.05) although the times for the peak values did not differ between groups.
The difference was significant at 06:00 hours, with the older group showing a significantly
lower temperature at the end of the night shift. The interaction was also apparent on the third
shift on night work, the older group experiencing a lower (P<0.05) oral temperature earlier in
the shift at 04:00 hours. The mean oral temperature decreased progressively over each
successive night in both young and old participants.
Grip strength and subjective arousal showed similar trends to those of oral temperature.
The old group reported significantly lower values than the young group (P<0.05) at 06:00 h
on the first night and at 04:00 h on the third and fifth nights.
A significant ‘time of day’ effect was observed over each of the five nights for
peak expiratory flow. Nevertheless, the same trends were evident in the old as in the young
group.

Table 2. Acrophases for the young and old groups for the first and fifth nights on
night, morning and afternoon shifts

Morning Shift
Oral temperature demonstrated a characteristic circadian rhythm on each of the five days
(P<0.0001). The age effect and the age by time of day interaction were significant for all five
days (P<0.001).
Right grip strength of the older group increased progressively during the first morning
shift until 10:00h. Values for the young group increased between 06:00 and 08:00h and then
remained fairly stable for the remainder of the shift period up to 14:00 hours. The grip
strength of the older group decreased following the 14:00h peak but remained above the
06:00h value. The differences between the groups were smallest at 06:00h.
The older group demonstrated greater alertness scores at 06:00h compared to the young
group (P<0.05).

Afternoon Shift
Oral temperature showed significant effects of time of day and of age (P<0.0001) for each
of the five days. The time of day by age interaction was significant (P<0.005) for days 1,
3 and 4.
Time of day effects and the interaction with age were significant for all five days
(P<0.0001) for right grip strength. The interaction effect was not significant for left grip
strength (P=0.053).
Alertness demonstrated significant time of day effects on all five days (P<0.001). The age
effect was significant on days 4 and 5 and the time of day by age interaction was significant
(P<0.05) on all days except the first one on this shift schedule.

Discussion
Colquhoun and Folkard (1978) reported data for a first night of shift-work. The data indicated
that for the first night shift the normal (that is diurnally phased) circadian rhythm is
maintained. The present data for the first night shift show a similar finding, although both the
old and young subjects reported a gradual decrease in temperature throughout the night.
The differences on both the first and third nights in oral temperature between the old and
young subjects suggest that the phasing or amplitudes of the groups’ circadian rhythms are different. The lower temperature of the older subjects at the end of the night shift (06:00h) suggests that, as a group, the older individuals may have had more difficulty in adjusting to work
at this time. This result is contrary to the findings for morning shift, where the differences
between older and younger subjects were smallest at 06:00h. When sleep is also considered,
the present findings are supported in the literature (Reilly et al., 1997). Following a night
shift, the older subjects’ circadian rhythms are usually more disturbed and the individuals are
less capable of performing work. Conversely, when subjects have slept during the night, the
differences between the older and younger subjects tend to be relatively small. The early
cessation of sleep (to commence work at 06:00h) seems to have had a large influence on the young
subjects, causing a relatively poor performance. However, for the older subjects, with an
increased ‘morningness’ (Reilly et al., 1997), the 06:00h start might have had little disruption
effect on their sleep, thereby allowing a relatively good performance.

Despite the higher amplitudes of rhythm in the younger subjects, this group had more
difficulties in adjusting to the morning shift. This contradicts the theory that high-amplitude
rhythms predict good tolerance to phase shifts.
The reduced differences in performance between the old and young groups during the morning shift further suggest the potential benefits of re-scheduling individuals’ work-shifts by age. If older individuals were scheduled to work morning shifts, then their
relative performance would be high when compared to a younger group. Increased
‘morningness’ of the older individuals would explain the finding. It must be noted that
although the differences between the old and young groups were smallest in the morning, the
older subjects did not out-perform the young individuals. During the night shift, the younger
individuals performed consistently better than the older subjects. Thus utilisation of older
workers, within shift systems, is probably most effective during the morning and afternoon
shifts. Overall, the adjustment to repeated phase shifts including nocturnal work was shift-
dependent and would support a scheduling scheme which takes age (but not necessarily
habitual physical activity) into account.

Acknowledgements
This work was supported by a grant from the Health and Safety Executive.

References
Atkinson, G., Coldwells, A., Reilly, T. and Waterhouse, J. 1993, A comparison of circadian
rhythms in work performance between physically active and inactive subjects.
Ergonomics, 36, 273–281
Barton, J., Folkard, S., Smith, L.R., Spelton, E.R. and Tattersall, PA. 1990, Standard
Shiftwork Index Manual. Sheffield, MRC/ESRC Social and Applied Psychology Unit
Colquhoun, W.P. and Folkard, S. 1978, Personality differences in body temperature and
their relation to its adjustment to night work. Ergonomics, 21, 811–817
Costa, G. 1997, The problem: shiftwork. Chronobiology International, 14, 89–98
Harma, M. 1995, Sleepiness and shiftwork: individual differences. Journal of Sleep
Research, 4 Suppl. 2, 57–61
Lamb, K.R. and Brodie, D.A. 1991, Leisure time physical activity as an estimator of
physical fitness: a validation study. Journal of Clinical Epidemiology, 44, 41–52
Monk, T.H. and Folkard, S. 1992, Making Shiftwork Tolerable, (Taylor and Francis,
London)
Nelson, W., Tong, U., Lee, J. and Halberg, F. 1979, Methods for cosinor rhythmometry.
Chronobiologia, 6, 305–323
Redlin, U. and Mrosovsky, N. 1997, Exercise and human rhythms: what we know and what
we need to know. Chronobiology International, 14, 221–229
Reilly, T., Waterhouse, J. and Atkinson, G. 1997, Ageing, rhythms of physical performance
and adjustment to changes in the sleep-activity cycle. Occupational and
Environmental Medicine, 54, 812–816
Waterhouse, J., Folkard, S. and Minors, D. 1992, Shiftwork, health and safety. An overview
of the scientific literature 1978–1990, London, HMSO
Young, B.M. 1982, The shift towards shiftwork. New Society, 61, 96–97
JOB DESIGN FOR UNIVERSITY TECHNICIANS: WORK
ACTIVITY AND ALLOCATION OF FUNCTION

R.F.Harrison*, A.Dickens and C.Baber

Industrial Ergonomics Group,


School of Manufacturing & Mechanical Engineering,
University of Birmingham,
Birmingham, B15 2TT, United Kingdom (* Presenter of paper).

The following paper examines the role of technicians within a university
environment. Five technicians specialising in various disciplines were
selected to form the research sample. Over a period of five days, the
constituents of the technicians’ jobs were defined using a flow process chart.
The distance travelled, time taken, tasks performed and problems
encountered were noted in order to highlight any inefficiencies. Further
information regarding job satisfaction was obtained using a Job Diagnostic
Survey (JDS). The results of the flow process chart show that a significant
percentage of the total time was spent walking to task destinations and
waiting for appropriate tools and access. Also, no formal procedure for
allocating and prioritising tasks was used within the university, and
technicians performed tasks as and when they arose.

Introduction
Research into job design has traditionally centred around manufacturing and service
industries (Adair and Murray, 1994). Little research has been carried out into the role of
technicians at Universities and other higher education institutions. The role of a technician is
to provide the appropriate technical support needed by the academic staff so that they can
teach students effectively. This means that a wide range of skills are often needed by the
technician, and their job can often require a high degree of movement between various
locations.
Due to a lack of job design guidelines for technicians, their roles tend to be evolutionary in nature, i.e. the methods employed by a technician result from his or her own experience of working within that particular role. The aim of the study is to identify the types of activities technicians perform during a working day, and to determine any inefficiencies that may result
from evolutionary job design. In addition, the technicians will be questioned regarding
their feelings toward their job, in order to determine whether or not job design affects
motivation.

Methodology

Sample Characteristics
Five technicians were selected as participants for the study. In order to be representative, the
sample was chosen to encompass as many as possible of the technicians’ job-types and responsibilities. Although their individual job-types varied considerably, all were similar in that they produced work on an ‘as needs’ basis for both students and academics. In addition, all had a wide range of skills and a high degree of mobility within the university. A minimum of five years’ experience was deemed necessary for all participants in the study. This ensured
that the study investigated the job itself and was not confounded by the inexperience of the
technician. The technicians’ ages ranged between 28 and 55 years.

Data Collection
Three methods of data collection were considered for the study: activity sampling, travel
diagrams and flow process charts (International Labour Office, 1979). Activity sampling was
not used due to the disruptive nature of the data collection, whilst travel diagrams were
considered too time-consuming for the technicians to complete. The flow process chart was
selected as the most suitable for the study, being simple to complete, yet comprehensive.

The flow process chart


The flow process chart as outlined by the International Labour Office (1979) was used. As the flow process chart is normally used to investigate factory jobs, slight adaptation was required to ensure that it accurately measured the parameters necessary for this study. The original chart categorised work activities as ‘operation’, ‘transport’, ‘delay’ or ‘storage’. It was deemed more appropriate, and easier to understand, to use the descriptions in Table 1 to categorise the activities of the technician:

Table 1. Categorisation of Work Activities

The technicians were required to complete the flow process charts themselves, including a
brief description to accompany each activity in order to aid analysis. In addition, technicians were asked to detail the distance they travelled when required to move. In order to minimise inter-participant and intra-participant variability when categorising activities, every technician was trained for at least one hour prior to beginning the study. All technicians were also supplied
with guidelines to support training, and as a source of reference during the study.
By allowing the technicians to chart their own work, it was found that the suspicion
normally associated with work studies, whereby the presence and observations of an outsider
could easily be misinterpreted, was lessened.
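As an indication of how self-recorded charts of this kind can be summarised into the activity breakdown reported below, the Python sketch that follows totals entries by category and computes a non-value-adding proportion. The category names and sample entries are hypothetical, since the adapted categories of Table 1 are not reproduced here.

    # Hypothetical aggregation of flow process chart entries into an activity breakdown.
    from collections import defaultdict

    # Each entry: (category, minutes). Category names are assumed, not those of Table 1.
    VALUE_ADDING = {"productive work"}
    entries = [
        ("productive work", 25), ("walking", 6), ("delay", 10),
        ("productive work", 40), ("walking", 4), ("paperwork", 15),
    ]

    totals = defaultdict(float)
    for category, minutes in entries:
        totals[category] += minutes

    total_time = sum(totals.values())
    non_value_adding = sum(m for cat, m in totals.items() if cat not in VALUE_ADDING)
    print({cat: f"{100 * m / total_time:.1f}%" for cat, m in totals.items()})
    print(f"non-value-adding time: {100 * non_value_adding / total_time:.1f}%")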

Job Diagnostic Survey


To measure beyond the purely physical aspects of the technicians’ role, another means of measurement was employed. A Job Diagnostic Survey (JDS) was selected, as described by
Hackman and Oldham (1980). Using the JDS it is possible to examine work dimensions such
as motivation and job satisfaction.
All results for the JDS are measured on a 7 point preference scale, where 1 is low and 7 is
high. The only exception is the Motivating Potential Score (MPS) which ranges from 1 to
343. This score is derived from the average of the ‘skill variety’, ‘task identity’ and ‘task significance’ scores, which is then multiplied by ‘autonomy’ and by ‘feedback from the job’.
The higher the score, the more motivated the individual.
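On that description, the MPS is the mean of the first three dimensions multiplied by the remaining two, which gives the stated range of 1 to 343 when each dimension is scored from 1 to 7. A minimal Python calculation is sketched below; the example scores are invented.

    # Motivating Potential Score as described above (after Hackman and Oldham, 1980):
    # the mean of skill variety, task identity and task significance, multiplied by
    # autonomy and by feedback from the job. With 1-7 scales, MPS ranges from 1 to 343.
    def mps(skill_variety, task_identity, task_significance, autonomy, feedback):
        core = (skill_variety + task_identity + task_significance) / 3.0
        return core * autonomy * feedback

    # Invented example scores on the 7-point scales.
    print(mps(6, 5, 6, 4, 3))   # ((6 + 5 + 6) / 3) * 4 * 3 = 68.0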
The JDS was chosen because it is regarded as a ‘good’ diagnostic tool (Hackman and
Oldham, 1980). There is, though, an absence of firm evidence about the validity and reliability
of some JDS measures, especially ‘growth need strength’ (Van der Zwaan, 1975).

Results

Flow process chart


Despite comprehensive training, it was found that one technician’s data were inconsistent and as a result were discarded. The remaining data for the flow process charts are depicted below in Figure 1.

Figure 1. Average Breakdown of Activities Per Day


The graph shows that a high proportion of the technicians’ time is non-value-adding, i.e. non-productive. In the worst case, 49.3% of the technicians’ time was non-value-adding, whilst in the best case the figure was 30%.

Table 2. Average Distance Travelled in One Day (m)

The data in the table above detail the distances moved by the technicians whilst performing their tasks. On average, a university technician walks 1166 metres per day.

Job Diagnostic Survey


Table 3. Results of Job Diagnostic Survey

The JDS shows that overall, technicians are very satisfied with their jobs and relatively
highly motivated. This is reflected by the MPS which was higher than the normative data for
professional or technical workers (Hackman and Oldham, 1980). However, low satisfaction
ratings were shown for pay, supervision and feedback.

Discussion
The results showed a high proportion of non-value-adding time due to the distances moved
and delays. From the activity description (not shown in this paper) it could be seen that these
delays encompassed: talking to students about jobs, unavailability of tools, interruptions from
staff or students, and the inability to gain access to essential areas. The JDS showed that
technicians were highly motivated individuals with a high degree of job satisfaction.
However, despite these positive results, they expressed dissatisfaction with pay, supervision and feedback. Although the results suggest that more supervisory support is needed, it is theorised that greater gains in job efficiency will come from further examining the technicians’
job design. Possible improvements to job design based on qualitative insights gained from
carrying out the study are discussed in the following paragraphs.
The first improvement involves determining the routine jobs performed by technicians,
such as maintenance, laboratory classes and paperwork. At present, routine jobs are fitted into
the technicians’ day whenever time is available. If set times are allocated, routine jobs can be
structured in the most efficient way, taking into account distances and tools required.
Technicians will not be available to aid students and staff during this time and can therefore
perform routine jobs with the minimum of delays.
The second type of structural improvement relates to the introduction of flexible time.
Flexible time will be allocated for requests for one-off jobs by students. Time is currently
wasted because students are unsure of which technician they need for certain problems, and
they will often approach several technicians before finding the correct one. It is suggested that a ‘technical services’ board be displayed within the department. This would list all the technicians, their specialities, and a means of contacting them during ‘flexible time’. In addition,
the board would refer students to a World Wide Web page containing in-depth information
about each technician’s role. By using the web pages in conjunction with the technical
services board, students could quickly ascertain which technician most suited their needs.
This would lead to a reduction in unnecessary interruptions by students.
The third improvement involves the use of request forms for students booking work with
particular technicians during their flexible time. The request form would ask for information regarding the nature of a job, alongside a means of contacting the student. By having request forms,
technicians could prioritise and plan work effectively, and they would be able to refer back to
previous jobs which may help direct current ones.
The final improvement relates to lack of feedback, as indicated by the JDS. The ‘request
form’ could have an additional section in which students give feedback, upon completion of a job, about the technician’s performance. To increase the level of feedback from academics and
supervisors, annual reviews of technicians should take place. The comments given by
students on ‘request forms’ could provide valuable assessment information. An annual review
would similarly allow the technician to highlight job design improvements drawn from their
experience (Hackman and Morris, 1975).
It should be noted that the above suggestions attempt to increase the efficiency of the
technician and represent a starting point, and all systems require continuous improvement to
remain effective. Similarly, due to the mobile nature of the job, it would be impossible to achieve 100% productive time, but it is expected that a target figure should be approximately 90%.

Conclusions
The study raised some interesting points regarding the work of the technician. However, it
would be beneficial to undertake an inter-university study to examine whether or not these
problems exist to the same extent in other institutions.

References
Adair, C.B. and Murray, B.A. 1994, Break-Through Process Redesign (Rath and Strong,
USA)
Hackman, J.R. and Morris, C.G. 1975, Group tasks, group interaction process, and group
performance effectiveness: A Review and proposed integration. In L.Berkowitz (ed.),
Advances in experimental social psychology (Academic Press, New York), 15–34
Hackman, J.R. and Oldham, G.R. 1980, Work Redesign (Addison-Wesley, USA)
International Labour Office 1979, Introduction to Work Study (ILO Publications, Geneva)
Van der Zwaan, A.H. 1975, The Sociotechnical Systems Approach: A Critical Evaluation,
International Journal of Production Research, 13, 149–163
SYSTEM DESIGN AND
ANALYSIS
ALLOCATION OF FUNCTIONS AND MANUFACTURING JOB
DESIGN BASED ON KNOWLEDGE REQUIREMENTS.

C.E.Siemieniuch, M.A.Sinclair and G.M.C.Vaughan

HUSAT Research Institute


Elms Grove
Loughborough, Leics., LE11 1RG

This approach addresses the design of new business processes, rather than
upgrades of cells. Organisations are construed as configurations of
knowledge, embodied in humans and machines, utilising data to create
information, and its physical manifestation (products for sale). The problem
is to optimise this configuration of knowledge and its allocation to humans
and machines. We start from: the operating environment; a knowledge
taxonomy; and a functional description of the process. This results in the
allocation of functions, the definition of human roles, and the distribution of
management functions very early in the design process.

Introduction
The frame of reference for this paper is manufacturing industry. For simplicity, we define two
categories of problems for the allocation of functions—the major facility (e.g. a new process
line), and the cell (e.g. the reconstruction of a manufacturing cell). We address the former of
these two problems, which represents a step-change in the organisation’s operations; the
latter problem, the gradual, incremental improvement to established processes, has been
addressed by many authors (e.g. Meister and Rabideau 1965; Döring 1976; Kantowitz and
Sorkin 1987; Mital et al. 1994a; Mital et al. 1994b).
An organisation can be construed as a configuration of knowledge, embodied in humans
and machines, which utilises data to create information (e.g. the product data model), and its
physical manifestation (products for sale). The problem is to optimise this configuration of
knowledge and its allocation to humans and technical systems. In doing this, a particular goal
for this method was to enable practitioners to carry out function allocation at a very early
stage in systems design, as the textbooks tell us to do, without offering useful suggestions as to how.
Note that each allocation decision creates at least one extra interaction task (e.g. co-
ordination). Dekker and Wright (1997) have argued, and managers will agree, that materials
transformation activities are seldom the source of manufacturing inefficiencies. More often,
failings in the associated information processing, communication and co-ordination tasks are the main source, and these also cause most of the accidents. Hence, early definition of roles and
responsibilities means that engineers can be given needs for control and communication in
time for inclusion in the process design, rather than having to cobble together solutions at a
late stage in the design, when many design decisions are fixed. Three premises underlie the
methodology.

• There is a basic, generic structure of knowledge for the manufacturing domain.


Abstract models of companies exist in practice and in text (Vernadat 1996).
• The design process for the structure of a company is invariant over the organisation’s
hierarchy (i.e. the levels of the hierarchy do not matter) and its business processes (i.e.
any process can be designed using the same method).
• This knowledge configuration can be constructed per process, and accumulated for the
whole facility. For process groups with devolved management, this is acceptable;
however, we believe that in a facility where similar processes occur in parallel the
methodology needs elaboration to deal with cross-process issues.

The starting points for the approach, which provide the only relatively stable set of
parameters for design in the early stages, are as follows.

• the operating environment (market conditions and company policies),


• a knowledge taxonomy for manufacturing functions/processes
• a function-based description of the activities in the facility.

There are two components to the DSS tool which embodies the approach: the positioning component and the role structuring component.
The Positioning component is a spread-sheet tool that captures the ‘framework of forces’
that act on a company arising from its internal and external environments. This results in a set of
five process characteristics, expressed as points on five separate continua. The continua are:

• Structure: from wholly project- to wholly function-based


• Control: from wholly project- to wholly function-based
• Process: from wholly sequential to wholly parallel
• People: from entirely specialist to entirely generalist skill sets
• Tools: from completely automated to completely manual tools (not used at present)

This component is described in (Brookes and Backhouse 1996), and is not considered further
here. The positioning tool is not essential, as long as the organisation has a formal statement
of policies on these continua. However, a property of the positioning tool revealed on several
occasions is its encouragement of discussion about important organisational issues that are
usually not considered, because everybody ‘knows’ what the situation is. The discussions
uncover hidden assumptions, uneven distribution of information, and errors in understanding, and it is recommended that the positioning tool is used initially.
The Allocation of Functions and Role Structuring component has two main modules: the functions/knowledge database and the sets of rules which act on the database. Firstly, there is the
generic functions/knowledge database which contains a set of connected, decomposable functions,
each of which is serviced by a particular set of knowledge classes and associated expertise levels.
The functions part of this database comprises a set of generic manufacturing processes, which can
be tailored by a user organisation to fit its circumstances. These functions are derived from process
charts for similar processes in a range of different manufacturing companies, and by reference to
standards (e.g. BSI 7000). Secondly there is a set of six rulesets, described below:
Meta-rules. These operate on the rulesets that follow. Their function is to switch particular rules on or off, depending on the output from the positioning component and the users’ choices. They translate the output from the first part of the tool into constraints for the second part. In particular, they resolve conflicts which can occur in the output from the positioning tool
(e.g. “everyone must be omni-competent, but we must have specialists”).
Process characteristics rules. There are four types: Structure, Control, Process and
People. These adjust the relationships between the process functions, their management, and
the configuration of knowledge serving these functions. They amend a generic set of
functions to fit the particular business process being considered.
Allocation of function rules. Following Mital, Motorwala et al. (1994a, 1994b), we consider that there are three categories of these rules. (a) Mandatory allocation: there are mandatory reasons for allocating a function to humans or machines; for example, safety practice, legal requirements, or engineering limitations. (b) Balance of value: rules based on estimates of the relative goodness of human and machine technology for performing the intended function. (c) Knowledge and communications characteristics: this category includes rules which can be employed to alter the pre-set attributes of functions in the database. These attributes are in effect pre-determined answers to the questions below in Table 1, which the user can change to fit the company circumstances.
Humans bring to the workplace particular abilities to perceive and interpret information,
to think, and to act, in the context of a variable environment. Automated systems are unlikely
to be competent to do this for some time to come. Consequently, if any of these abilities are
required in order to perform some function, then the function must be allocated to humans.
Question classes which explore this are as shown in Table 1. They have been worded such
that a ‘No’ answer implies that the function should be carried out by humans. Note that this
does not mean that a human must perform the function unaided; merely that a human must be
in direct, real-time and online control of the function. It is possible that as the design
progresses, these decisions can be reviewed.

Table 1: Classes of questions for allocation of function rules.
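As a simple illustration of how such question classes might drive the mandatory-allocation decision, the Python sketch below examines a function’s pre-set attribute answers and flags the function for human control whenever any answer is ‘No’. The question wordings are placeholders and are not the actual contents of Table 1.

    # Hypothetical check of allocation-of-function question classes. Questions are
    # worded so that a 'No' answer implies that the function must remain under
    # direct, real-time human control (possibly machine-assisted).
    QUESTION_CLASSES = [   # placeholder wordings, not the entries of Table 1
        "Can the required information always be sensed by machine?",
        "Can the situation always be interpreted without human judgement?",
        "Can the response always be pre-specified for a variable environment?",
    ]

    def allocate(function_name, answers):
        """answers: mapping of question text to True ('Yes') or False ('No')."""
        if any(not answers.get(q, False) for q in QUESTION_CLASSES):
            return f"{function_name}: allocate to human (direct, real-time control)"
        return f"{function_name}: candidate for allocation to machine"

    print(allocate("dimension checking", {q: True for q in QUESTION_CLASSES}))
    print(allocate("reaction to alarms", {QUESTION_CLASSES[0]: True}))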

Knowledge rules. These are used to identify matches between the functions in terms of the
knowledge required for each of the functions. There are 16 of these, in a sequential set. The first
rule is the most restrictive in its application conditions, and the last is the most relaxed, allowing
almost any two functions to be matched and combined. Matching is on the basis of four criteria (or fewer, depending on the rule): adjacency of the functions; completeness of the match in knowledge classes; completeness of the match including levels of expertise; and homo-location within a defined process sub-section. The consequence of applying only the first rule is that a multitude of single-function jobs are produced; if only the last is applied, a few comprehensive jobs are
produced suitable only for polymaths. Consequently, the set is applied in sequence, with provision
for back-tracking to permit alternative groupings of functions to be achieved.
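The sequential matching idea can be pictured as a progressive relaxation of the criteria under which two functions may be grouped into one role. The Python sketch below is a much-simplified, hypothetical rendering using only two of the four criteria (adjacency and overlap of knowledge classes); it is not the tool’s actual rule set.

    # Simplified, hypothetical sketch of sequential knowledge-rule matching.
    # Each function carries a set of knowledge classes; rules are tried from the
    # most restrictive to the most relaxed until one permits the grouping.
    functions = {
        "set saw parameters":   {"sawing", "product specs"},
        "monitor cut quality":  {"sawing", "product specs", "quality"},
        "schedule maintenance": {"maintenance"},
    }
    adjacent = {("set saw parameters", "monitor cut quality")}   # assumed process order

    def overlap(a, b):
        ka, kb = functions[a], functions[b]
        return len(ka & kb) / len(ka | kb)

    RULES = [   # (minimum knowledge overlap, adjacency required) - strictest first
        (1.0, True), (0.5, True), (0.25, False), (0.0, False),
    ]

    def first_matching_rule(a, b):
        for i, (min_overlap, need_adjacent) in enumerate(RULES):
            if overlap(a, b) >= min_overlap and (not need_adjacent or (a, b) in adjacent):
                return i
        return None

    print(first_matching_rule("set saw parameters", "monitor cut quality"))   # 1
    print(first_matching_rule("set saw parameters", "schedule maintenance"))  # 3 (most relaxed)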
Role definition rules. Their function is to control the groupings generated by the
knowledge rules, so that at the end of the exercise a sensible set of function groupings has
been achieved, perhaps with some functions still left dangling, and no nonsensical groupings
have been produced. Note that these rules take little cognisance of workload, though they do
crudely recognise the concept of a ‘headful’ of knowledge.
Job design guidelines. These comprise advice to the user on what to do about the
‘danglers’, and on further editing of the role groupings developed above, to make them more
suitable to the characteristics of the process and the organisation in which the process occurs.
This extra editing will arise from the users’ knowledge of the local context, to which the tool
could not be privy. The output from this will be an agreed set of roles, comprising the
operational functions, and the process management functions.
Authority vs. empowerment rules. Up to this point, roles have been defined with only implicit ‘boundaries’. The only identified transactions across these boundaries are those defined by the process: traffic in operational information, products, and the like. This last
rule-set now defines the nature of management links between the roles, so that a process can
be controlled. The rules define role boundaries, and the relationships between operational
roles, between process management roles, and between both of these. This is accomplished
by the use of ‘Paste functions’, discussed below.

Paste functions
Eight types of paste functions have been defined (so-called because they are ‘pasted’ between
functions). With the exception of the ‘Congruence’ paste function, they indicate the
boundaries between two roles. Different types of relationship between roles are shown by
different combinations of the paste functions (e.g. ‘autocratic’, ‘empowered’, or ‘peer-to-peer’). Once inserted, these Paste Functions comprise the structure and communications for
management of the process under consideration.
Congruence. This facilitates the working of concurrent functions, and provides
notifications regarding the availability, status and timeliness of the data flows between them.
This occurs within a role.
Hand-over. This indicates the hand-over of responsibility and authority from one role to
another role (most often from a project management role to an operational role).
Targeting. This paste function provides a context for the hand-over or delegation of
responsibility and authority. Targeting is also a one-way paste function operating between
one role and another.
Co-ordination. The Co-ordination paste function establishes two-way communication
between two roles. In order to achieve this, certain functions must be present within the management part of each of the two roles linked by this paste function: ‘plan’, ‘monitor’, ‘co-ordinate’ and ‘report’.
Integration. The Integration paste function ensures that activities within different roles are
working to the same (moving) goals and with the same parameters. This does not imply
control by one role over another. Wherever an information or data link exists between roles
then an Integration paste function will be required.
Control. This paste function means that one role has ultimate authority and responsibility
for any group of functions carried out by other role(s).
Delegation. This is a variation of the Hand-over paste function, in that it includes tightly-
constrained conditions (e.g. no variance allowed in budget, allocation of resources,
timescales, etc.), and typically occurs with the Control paste function.
Propagation. The Propagation paste function transfers information between processes
rather than along a process. It enables organisational learning.
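One way to picture the paste functions is as a small set of typed links placed between roles (or, for Congruence, within a role), from which the management structure and its communications can later be read off. The Python sketch below is a hypothetical data-structure rendering only and does not reproduce the tool’s implementation; the example combination of links is likewise assumed.

    # Hypothetical rendering of paste functions as typed links between roles.
    from dataclasses import dataclass
    from enum import Enum, auto

    class Paste(Enum):
        CONGRUENCE = auto()     # within a role: keeps concurrent functions in step
        HAND_OVER = auto()      # one-way transfer of responsibility and authority
        TARGETING = auto()      # one-way context for hand-over or delegation
        CO_ORDINATION = auto()  # two-way: plan / monitor / co-ordinate / report
        INTEGRATION = auto()    # shared (moving) goals and parameters, no control implied
        CONTROL = auto()        # one role has ultimate authority over another's functions
        DELEGATION = auto()     # hand-over under tightly constrained conditions
        PROPAGATION = auto()    # transfers information between processes (learning)

    @dataclass
    class Link:
        source: str
        target: str
        kind: Paste

    # Assumed example of a combination of links between two roles.
    links = [
        Link("process manager", "operational role", Paste.TARGETING),
        Link("process manager", "operational role", Paste.HAND_OVER),
        Link("process manager", "operational role", Paste.INTEGRATION),
    ]
    print([link.kind.name for link in links])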
Upon completion of the actions of these rule-sets, and of the inputs by the users during
their control of the rule-sets, there will be an output providing the functions allocated to
humans (and, by default, those allocated to technology), the grouping of these functions into
roles, a listing of the knowledge classes and levels of skill required for these roles, and the
organisational structures into which these roles fit, together with the nature of the
management communications between them.

Evaluation of this approach


This methodology has been developed in the SIMPLOFI (“Simultaneous Engineering through
People, Organisation and Function Integration”) project, funded by the Engineering and Physical
Sciences Research Council, UK. Six industrial companies took part. They ranged from ‘large’
to ‘medium-sized’; none were ‘small’. The manufacturing domains were computers, automobiles,
materials testing, railway subsystems, mining equipment, and materials handling. In each of the
companies a case-study approach was adopted, partly to develop the methodology, and partly to
demonstrate its relevance to users in their endeavours both to understand and to improve their
current processes. General responses from the users involved were:
• the technical jargon and the concepts were unfamiliar to the users, and some effort was
necessary before the methodology would be usable by managers and others.
• the exercise was thought-provoking, and in some cases had led to a redefinition of
existing problems. Some felt relief that their thinking was supported by the tool.
Summarising the strengths and weaknesses of the approach:
+ It provides a tool for the organisation to explore the staffing of future processes, or
process re-engineering ideas, where there are few tools currently available
+ It provides an opportunity to influence the important, early process design decisions
from a sociotechnical viewpoint, rather than just a technical viewpoint
+ It provides a documented audit trail of users’ decisions and the consequences of these
decisions, impartially, and in more detail than is usually the case.
+ It provides a means of clarifying and crystallising differences between users in their
visions of the future and its consequences
+ It provides worked-through alternatives for the user to consider
+ It provides a basis for early recognition of training needs and selection requirements
+ Paste Functions allow comparisons of management structures, and are the basis for an
evaluation tool with metrics for the appropriateness of management structures.
+ The approach is being explored for incorporation into Reference Architectures for
enterprise modelling, and the development of IT infrastructures and applications.
- In its current form, the methodology requires domain experts for its use.
- The methodology is still a prototype, and requires further work and validation.
- The methodology is designed to work with sparse, early design information.
Necessarily, therefore, it can only offer guidance to users, not solutions.
- The tool cannot deal with workload issues; in other words, it can tell you what kinds of
roles are necessary, but not how many of each is necessary.
- For the tool to be properly effective, it should be one among a suite of interoperable
tools for business process/enterprise modelling. This is not yet available.

References
Brookes, N.J. and C.J.Backhouse (1996). Understanding Concurrent Engineering practice: a
case-study approach, Dept of Manufacturing Engineering, Loughborough University,
LE11–3TU.
Dekker, S.A. and P.C.Wright (1997). Function allocation: a question of task transformation
not allocation. ALLFN’97—Revisiting the allocation of functions issue, Galway,
IEAPress, Louisville.
Döring, B. (1976). Analytical methods in man-machine system development. Introduction to human
engineering. K.-F.Kraiss and J.Moraal. Köln, Verlag TÜV Rheinland GmbH: Ch. 10.
Kantowitz, B. and R.Sorkin (1987). Allocation of functions. Handbook of human factors.
G.Salvendy. New York, J.Wiley & Sons: 355–369.
Meister, D. and G.F.Rabideau (1965). Human factors evaluation in system development. New
York, John Wiley & Sons.
Mital, A., A.Motorwala, et al. (1994a). “Allocation of functions to humans and machines in a
manufacturing environment: Part 1—Guidelines for practitioners.” International
Journal of Industrial Ergonomics. 14(1 and 2):3–31.
Mital, A., A.Motorwala, et al. (1994b). “Allocation of functions to humans and machines in a
manufacturing environment: Part 2—Scientific basis (knowledge basis) for the
Guide.” International Journal of Industrial Ergonomics. 14(1 and 2):33–49.
Vernadat, F.B. (1996). Enterprise modelling and integration. London, Chapman & Hall.
THE NEED TO SPECIFY COGNITION WITHIN SYSTEM
REQUIREMENTS

Iain S MacLeod

Aerosystems International
West Hendford
Yeovil, BA20 2AL
UK

Systems Engineering separates system design into logical specification and
physical design stages. After the logical stage, the system design is
approached through various iterations of physical design processes. The usual
consideration of Human Factors (HF) at the logical stage of design is in the
form of human constraints on system design. This form of HF consideration
is necessary. However, human system related performance requirements
should also be considered. Therefore, HF requirements should be carefully
introduced as system functional requirements at the logical phase.
Otherwise, HF influence on design has no trace to system requirements, and
HF work is commonly performed late in the physical phase. Also, design requirements
for HF should arguably result from an early consideration of system
cognition. By cognition we refer to system goal related properties of work
concerned with system direction & control, situation analysis, system
management, supervision, knowledge application, and anticipation.

Introduction
Systems Engineering (SE) separates system design into logical specification of system
functions and performance as a basis for subsequent physical design stages (see IEEE 1220,
1994). The logical stage allows requirements capture and functional/performance
specification that is logical and implementation-free. After the logical stage, the system
design is approached through various staged iterations of physical design processes, for example a
design stage of synthesis, where requirements are equated with proposed system
architectures and candidate technologies. Traceability threads are maintained to the initial
specification to help ensure that the initial requirements are met by the design and that design
changes are noted. Change details are recorded detailing not only the thread to the original
requirements but also change origins and the reasons for the change. As technology
introduces greater complexity to systems it is important that tenets, such as those of SE, are
applied to design processes to promote quality in design.
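
As a hedged illustration only (the record structure, fields and example values are invented here and are not prescribed by IEEE 1220), a change record carrying such a traceability thread might be sketched in Python as:

# Illustrative sketch: one way of recording a design change together with its
# thread back to the originating requirement. All identifiers are invented.
from dataclasses import dataclass

@dataclass
class ChangeRecord:
    requirement_id: str   # traceability thread to the original requirement
    design_item: str      # element of the physical design that changed
    origin: str           # where the change originated
    reason: str           # why the change was made

    def trace(self) -> str:
        return f"{self.design_item} -> {self.requirement_id}: {self.reason} ({self.origin})"

record = ChangeRecord("REQ-042", "primary display layout",
                      "design review", "reduce visual search time")
print(record.trace())
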
Unfortunately, Human Factors (HF) is poorly considered by the logical stage and is only
considered in detail by the systems engineering process during physical design. The result of
this lack of early consideration is that there is a sparsity of HF requirements on which to base
physical design. There is poor traceability of HF activities to system requirements and,
therefore, its benefits to design are hard to determine and its cost is difficult to justify to the
engineering design world. This relegates much of the HF contribution to design to ‘tidy-up’
activities around the engineered design of the system.
The usual consideration of HF at the logical stage is in the form of human constraints on
system design, for example cabin dimensions and seating. These details are important in the
support of human work within the designed system. However, they add little to the
specification of the human functions within the system and of their expected performance.
This article reasons that the system requirements for cognition should be initially
approached within the early specification of the system. It is argued that by this means
operator performance can be properly considered within the design of a system. It is true that
all engineered systems are designed to meet specified performance criteria but that operator
performance is poorly considered within design processes. Moreover, only through
consideration of the system requirements on the operator can the trust of the operator in the
system be promoted.
By cognition we refer to system goal related properties of work concerned with system
direction & control, situation analysis, system management, supervision, knowledge
application, anticipation of events, and associated teamwork (MacLeod, 1996). These
properties are normally relegated solely to the domain of human expertise and are not
considered within system requirements/performance specification.
However, it will be argued that in an advanced technology system it is important to
recognise what functions of cognition are required in support of the system model and to
include them in the specification. Arguably, only through careful early specification can man and
machine functions be adequately complemented within a system. Moreover, looking to the
future, machine assistants to the human must possess cognitive functions if they are truly to
complement human performance in system control.

HF and Systems Engineering (SE)


SE places much emphasis on the capturing of system requirements and the performance
specification of these requirements. Consequently, a sub-specialisation of the discipline has
developed, termed Requirements Engineering. Requirements Engineering formulates the
customer’s requirements for the system, requirements that are often ambiguous and
incomplete, into a set of logical functional and performance statements that represents a total
system specification.
SE’s strength is that it represents a multi-disciplinary approach to the engineering of
systems. Unfortunately, SE specification processes typically only cater for HF constraints on
the system, constraints defining system design boundaries. Further, HF constraints are not
generally accompanied by performance requirements and mainly consider issues related to
working space and habitability. HF constraints tend to be concerned with limits of human
capability rather than the required human contribution to the specified performance of the
system.
Current SE practices logically specify the system to be engineered. Within this
specification there may be implicit HF requirements. However, traditionally these
requirements have only been considered during physical design with relation to the Human
Machine Interface. Therefore, the majority of the needed human contribution to system
performance is met by operator expertise, often developed separately from any design
intent. Operator expertise will always be required within a man-machine system if for no
other reason than the need to cater for uncertainties in the system operating environment
(MacLeod & Wells, 1997). This expertise is created as a product of personnel selection,
training, and experience with systems. Expertise encompasses the high-quality operator
performance required within the system, including anticipation and system control.
However, a design aim should be that associated operator expertise should never need to be
developed to largely cater for inefficiencies in the physical design of the system.
It is a truism that changes in technology change the nature of human work. For example,
as advancing technology promotes increasing levels of system automation so the human role
within a system becomes one more of supervision than direct control, more of cognitive type
activity than of manifest physical activities. Such changes should be accompanied by an
understanding of the nature of these changes. Understanding of these changes must be
accompanied by the development of new approaches and methods applicable to system
design, developments that allow high quality engineered systems to be produced that fit with
their performance requirements.
SE is a growing discipline. In the U.S.A. SE is starting to approach the issues of HF
within specification. The International Council on Systems Engineering (INCOSE) has HF as
one of its main themes at its next conference at Vancouver in 1998. In the UK, the British
Psychological Society Special Interest Group on Engineering Psychology has an active
working group examining the problems associated with the specification of cognitive
functions with relation to the design of systems.

What are System Cognitive Functions (SCFs)?


A Function is stated here to be a system property that is latent until required, has an expected
level of performance, and is appropriately evoked by engineered automation or the system
operator through the application of effort (MacLeod & Scaife, 1997).
In contrast, a Task involves effort and is a system’s planned application of its functionality
towards the satisfaction of explicit system goals. Tasks may involve one or more functions
and may solely reside within the engineered system, be unique to the work performed by the
system operator, or involve the use of functions from both. Tasks can be physical in nature,
cognitive in nature, or a combination of the two (MacLeod, 1993). Further, the term cognition
(e.g. encompassing knowing, understanding, anticipation, directing, mediation of skilled
application, and control) can apply to functions and tasks resident in either man or machine,
or both (Hollnagel & Woods, 1983). With the human operator, activities and actions are
necessary to address task performance. Mediating between operator system tasks and
activities are the system operator’s pertinent Cognitive Functions. Thus:

Task >> Cognitive Function >> Activity >> System Feedback

However, for the sake of system design, system related cognitive functions differ from the
system operator’s cognitive functions. SCFs are concerned with system performance issues
and not the individual expertise and cognitive processes of an operator. Whilst operator
cognitive functions, as introduced above, are tuned to the living needs of the individual, SCFs
are functions that are required solely to allow the system to meet its designed performance.
Importantly, the specified cognitive functions should not try to represent a model of the
human system which has far too many variables to make it of practical use (Chapanis, 1996).
Rather, the SCFs should cover what the operator has to understand and activate with relation
to the work situation and its associated operating procedures for control, direction, and
management of the system. In the system they represent a complementary functionality to
both engineering functionality and operator cognitive functionality.
Therefore, SCFs are supported by operator cognitive functions but are concerned with
purely system issues such as system control and management. As such, it is possible to
engineer some SCFs. For example, some system management functions could be sensibly
automated to complement the human tasks in the management and supervision of the system.
Any SCFs considered here reside within the sphere of overall system requirements.
Moreover, the eventual performance of these functions by man or machine, arrived at during
the iterative and physical processes of design, can be represented by tasks undertaken by the
engineered system, the system operator, or both.

The Association of SCFs with Other SE Functions


SE is similar to other forms of engineering in that a system is devised ‘Top Down’ from
performance specification to a decomposition of functions into an associated hierarchy of sub-
functionality. It is suggested that SCFs can be considered in a similar fashion, not as a
decomposition of the cognitive function, but as an association of the appropriate cognitive
function to the engineering derived function. By this method, not all engineered functions
would have an associated cognitive function, though some engineered functions might have
many. Here we have a more complete method of considering the transposition between the
evocation of engineering functions and the performance of system tasks.
By considering the cognitive functionality required to support the system engineered
functions it will not only be easier to determine the form of system tasks, and the
participation in these tasks of system components, but it also becomes possible to consider automation
in a new light. Because cognition is now in the system frame of consideration, the cognitive
functions that are needed to support the functions resident in technology can be made
explicit. This will mean that cognitive functions can either be automated through the new
technology, if the adopted technology allows and it is desirable, or they can be better
understood to assist the training of operators and the design of HMIs that are task orientated
rather than system-function orientated.
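
A minimal Python sketch of this association is given below; all function and SCF names are invented examples, not drawn from any particular system. Each engineering-derived function carries zero or more associated SCFs, so the cognitive support required by technology-resident functions can be listed explicitly:

# Hypothetical sketch: engineered functions in a 'top down' hierarchy, each
# carrying zero or more associated system cognitive functions (SCFs).
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class EngineeredFunction:
    name: str
    sub_functions: List["EngineeredFunction"] = field(default_factory=list)
    associated_scfs: List[str] = field(default_factory=list)

def list_scfs(fn: EngineeredFunction) -> List[Tuple[str, str]]:
    """Walk the hierarchy and make the required cognitive support explicit."""
    pairs = [(fn.name, scf) for scf in fn.associated_scfs]
    for sub in fn.sub_functions:
        pairs.extend(list_scfs(sub))
    return pairs

nav = EngineeredFunction("navigation", associated_scfs=["anticipate route deviation"])
fuel = EngineeredFunction("fuel management",
                          associated_scfs=["monitor fuel state", "plan contingency use"])
system = EngineeredFunction("mission system", sub_functions=[nav, fuel])
print(list_scfs(system))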

The Form of SCFs


We earlier discussed the meaning of cognition. It is now necessary to consider the particular
form and nature of SCFs. Firstly, we are complementing existing forms of functionality
within the system: engineering and operator cognitive. Secondly, that complementation will
exist with consideration to specific system task areas: currently suggested as management,
supervision, direction, control, and analysis as associated with evoked engineering
functionality. Thirdly, the existence of the SCFs may or may not be manifest to other
component parts of the system.
If the SCFs are not manifest the form may be that of system error or status checking where
other components of the system may not require the results of the associated error or status
checking tasks provided the results are within pre-defined or acceptable limits. If manifest, the
SCFs will have various forms depending on what parts of the system need to be aware of the
results of the evoked SCFs. Therefore, the forms in this case should be task and context
sensitive. For example, the form of a SCF assisted warning on the safety of the system would
be different from the form of an SCF assisted advisory message.
Within the engineered system the interface of the SCFs would be a matter of engineering
physical design but would have implications on the safety criticality (or any other criticality)
of the system. In the case of interfacing with the system operator’s cognitive functions, the
question of which operator sense or senses would be most appropriate as a perception receptor
would have to be considered, as would the information content of the communication. In
many cases good Human Computer Interface practice could be applied. However, with
advanced technology systems there should also be distinctions and separate considerations
applied to certain forms of intra-system communications. As examples: 1) the difference
between the system performance and communication of an activity, and the system performance of
inter-component assistance; 2) the difference between the passing of information between components and
the proffering of advice from one component to another.
There are of course parallel arguments that can be entered into the debate. These include arguments
on the nature of consciousness (e.g. Penrose, 1997), the differences between language and
conscious thought (e.g. Hardcastle, 1995), whole/part perception (e.g. Arnheim, 1986), and
the very existence of cognitive functions as discussed in this article. However, what is argued
in this article is the capture of a system function, an SCF, which will complement other
system functions to the benefit of system performance. Such functions do not have to mirror
the processes of the brain, do not have to have sentience or be aware, but must have a high
quality ability to assess situations from information gained both internally and externally
to the system. They must also be capable of communicating effectively.
What will definitely be needed, not only for any SCF-based approach but also to match new
advances in technology, is the creation and adoption of new design methods. To be adopted,
these design methods must complement those already existing in disciplines such as SE.
Some of these methods may already exist but are currently treated with some scepticism. An
example of argued conflicts over the validity of methods can be seen in Psychology in the
debate between the use of Qualitative versus Quantitative methods. It is submitted that
qualitative methods would be required to determine the form and nature of SCFs.

The Issues
The issues discussed in this article are all related to one issue, namely that:
Advanced technologies normally have no accompanying and accepted design
philosophies and methods for their effective and quality incorporation into current system
design processes.
It is essential that better design processes are found to allow the quality adoption of new
and advanced technologies as they emerge. If old and inappropriate methods are applied to
the adoption of new technologies within design, the result is invariably the production of
systems that fail to meet performance requirements. Further, if the designers do not
understand the implications of adopted new technologies, then the delivered system will not
only underperform as a system; its very operation will be an unquantified risk.
This short article has introduced the idea of incorporating System Cognitive
Functions (SCFs) into design as a method of improving the requirements specification of
systems and, through so doing, of approaching an understanding of the roles of new advanced
technologies and their implications for Human Machine System design. Design processes
need to incorporate a greater understanding of the system requirements on human
performance, and the support that must be offered to human cognition, if the system is to
operate under control towards its designed performance goals.
It is suggested that the ‘old and inappropriate methods’ are endemic within HF. It is time
for HF to join and complement the other system design disciplines to assist in the
development of high quality design processes that allow the adoption of advanced
technologies. It is even possible that HF practitioners could lead the way.

References
Arnheim, R. (1986). The trouble with wholes and parts. New Ideas in Psychology, 4, 281–284.
Chapanis, A. (1996), Human Factors in Systems Engineering, Wiley, New York.
Hardcastle, V.G. (1995). Locating Consciousness. Amsterdam & Philadelphia: John Benjamins Press.
Hollnagel, E. & Woods, D.D. (1983) Cognitive systems engineering: New wine in new bottles,
International Journal of Man-Machine Studies, 18, 583–600.
IEEE P1220 (1994), Standard for the Application and Management of the SE Process, IEEE
Standards Department, Piscataway, NJ.
MacLeod, I.S. (1996), Cognitive quality in advanced crew system concepts: The training of the aircrew-
machine team, Contemporary Ergonomics 1996, Taylor & Francis.
MacLeod, I.S. and Scaife, R (1997), What is Functionality to be Allocated?, in Proceedings of
ALLFN’97, Galway, Ireland.
MacLeod, I.S. and Taylor R.M. (1993), Does Human Cognition Allow Human Factors (HF)
Certification of Advanced Aircrew Systems? Proceedings of the Workshop on Human Factors
Certification of Advanced Aviation Technologies, Toulouse, France, 19–23 July, Embry-Riddle
University Press, FLA
MacLeod, I.S. and Wells, L. (1997), Process Control in Uncertainty, in Proceedings of 6th European
Conference on Cognitive Science Approaches to Process Control, Baveno, Italy, September
Penrose, R (1997), The Large, the Small and the Human Mind, Cambridge University Press, Cambridge,
UK.
ANALYSIS OF COMPLEX COMMUNICATION TASKS

Jonas Wikman

Communication Research Unit


Department of Psychology
Umeå University, S-901 87, Umeå
Sweden

The present paper has a specific emphasis on problem solving and decision making
tasks where the tools for execution and completion have distinct communicative
features. Two possible analytical strategies for such tasks are critically examined:
a laboratory-based, micro level strategy and a strategy based on ergonomics.
The aim is to compose an analytical approach that can account for task
uncertainty, variation, and dynamics in communication settings. Systems
analysis and constructs from organisational psychology and constructivistic
theory are suggested as means to overcome some of the current limitations.

Introduction
Modern real life tasks are characterised by a high degree of uncertainty. The trends in
working life towards flexible working hours, flexispace, team, and project organisations,
conditioned by the proliferation of information technology, accentuate the dynamic aspects of
work. Thus, a characterisation of modern, discretionary tasks must be guided by analytical
approaches that can account for the critical dimensions mentioned. The present paper is
concerned with the analytical and methodological consequences of the evolution of work
practice and follows the discussions that have accompanied the development of task analysis.
Emphasis is on real-life tasks in naturalistic settings where the primary means for execution
and completion is interpersonal communication.

Traditional work or job analysis methods have focused on tasks with physical and procedural
characteristics, existing in stable work environments. The seemingly clear, established and
uncomplicated relation between a task and the goal has made explicit analysis of goal hierarchies
redundant, since they have, supposedly, been embedded in practice and customs. However,
substantial changes in working life have taken place and the demands on human performance
have shifted from primarily physical to cognitive. As a consequence, the discrepancy between
reality and models of analysis has become apparent. The new complex, process-oriented tasks
that require activities such as co-ordination, planning, decision-making, and communication do
not offer correct behavioural indicators needed to specify the parameters that regulate behaviour.
This underspecification leads to an inability to validate the mediating processes that determine
task performance.

Problems with task analysis


It is possible to distinguish between two evolving task analysis traditions. The theory driven,
nomothetic approach that concentrates on micro level of task analysis (e.g. Wickens, 1992) is
closely related to the dominant, basic research trend in cognitive psychology. The essence of
the classic critique of this laboratory-based, cognitive approach, put forth by Neisser (1976)
and others, concerns construct validity and can be summarised as follows: In the laboratory a
limited number of variables can be manipulated independently. However, the only thing that
guarantees this independence is the simple and stable environment that characterises the
experimental situation in which the boundaries are clear, and relevant contextual factors are
eliminated or held constant. In real life situations no controls that would warrant this
independence may exist. It is a possible scenario that, when the experimental control is lifted,
contextual and confounding variables are added that might invalidate an experimentally
confirmed independence. This independence may be the basic condition on which the
theorising rests and, in effect, the whole line of reasoning will be degraded. In addition, the
approach is based on a model of reality where closed systems are linearly and additively
combined into seemingly open ones where boundary conditions are constant. Such a model
has proved to be of little use for practitioners (Hollnagel, 1982).

An alternative, more or less atheoretic approach has been criticised for suffering from
problems that are the reverse of those of the micro level tradition. The starting point for this
ergonomics-based approach is the task together with the restrictions and/or properties of the
situation as they occur in the specific workplace. Researchers construct heuristic and
customised models that are well grounded in the situation and comprise all relevant
contextual variables. The prime ambition has been to solve the practical problem at hand and
not to develop a general methodological and theoretical base. The concepts and models
applied are validated in the situation, but transfer to other contexts is secondary (Rasmussen,
1993). Consequently, the representation of contextual factors in these models usually has
high construct validity. However, detailed analysis is often restricted to specific tasks such as
aviation and process control. In these contexts, the boundary conditions are thoroughly
explored, but general cognitive explanations are lacking, which limits external validity. In general, the
inability of ergonomists to build theory that stretches outside the immediate application area
invites critics to question the already shallow theoretical foundation.

The validation issues and the systems definition


While micro-level analysts ignore contextual dependence, this alternative ergonomic
approach is overwhelmed by it. This has implications for an analysis of problem solving tasks
with communicative features: the theoretical and practical base needs to be broadened so that
the absence of valid psychological constructs within the ergonomic approach can be rectified
and thereby offer the necessary means for generalisation. The basic issue is what system is
under study. Following Cook and Campbell’s (1979) discussion on validity, it should be
noted that the time dimension does not have to be the sole parameter that affects the
development of the function between two situations; it may be the situational or the
populational dimensions, or an interaction between some combination of the three, that
produces the changes between the first and second states. The question concerns what the
difference is between the original empirical corroboration of the fact and the generalisation:
Is it the nature of the system or is it the parameters given in the definition of the system? With
an underspecified systems definition it is impossible to determine where the boundary
between the system under study and other systems is drawn.

The approach suggested in this paper is based on systems theory (Katz & Kahn, 1978). The
systems approach requires a description of boundary conditions, the interaction between the
system and its context, and the structure of the dynamic transformation process within the
system with an emphasis on regulatory feed-back loops.
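
As a generic illustration only (this is not a rendering of Katz and Kahn's model, and the gain and reference values are arbitrary), a regulatory feed-back loop can be sketched in Python as a system whose output is compared with a reference level and fed back to adjust the next transformation step:

# Generic sketch of a regulatory feed-back loop: the difference between the
# system's output and a reference level is fed back to adjust the next input.
# All values are arbitrary illustrations.
def regulate(reference=10.0, gain=0.5, steps=6):
    output = 0.0
    history = []
    for _ in range(steps):
        error = reference - output   # mismatch with the reference state
        output += gain * error       # transformation inside the system
        history.append(round(output, 2))
    return history

print(regulate())   # output converges towards the reference level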

Systems analysis—state of the art


From the perspective of Rasmussen (1993) and others, there seems to be little controversy in
stating that the major inadequacy in contemporary human factors applications is the lack of
explicit systems analysis. Practitioners approach the task without paying attention to basic company
goals, functions and processes which are crucial in order to identify existing options for change.

Possible theoretical specifications


The question is how an approach should be broadened in order to meet current constraints.
The suggestion is a multiple levels of analysis approach (Pfeffer, 1985), with the inclusion of
the role concept, the incorporation of communication in the analysis of organisational
processes, and the use of organisation theoretic concepts such as context, external
environment, technology, goals, and organisational structure.

The theoretical and methodological framework of organisation theory offers, through these
macroergonomic factors (Hendricks, 1995), tools that readily can be used to guide analysis.
However, the complex systems under study are not exhaustively defined in terms of principles
of organisational structure and behaviour. The study of communication behaviour at work has
received little attention in current task analysis approaches. Yet, interpersonal communication is
usually an integral part of task behaviour. It is also possible that the constructivistic school can
contribute to the analysis of complex tasks. Theoretically, they develop the idea of a cognitive
system and view the individual as a consciously reflecting organism engaged in intelligent
social action in which information is selected, transformed and enriched. The constructivists
suggest an extension of the role concept which may refine the systems approach and allow the
researcher to view social systems as consisting of roles rather than individuals.

Task-related communication—theoretical and methodological implications


To what extent are macro-organisational principles mirrored through the task related
communication at work, and to what extent can socially constructed models be inferred from
communication data? In the first case it is reasonable to expect a correlation, since the work
activities are integrated in an organisational system and the task related discourse should reflect
this system. Such communication data can, in principle, be determined relatively objectively,
since they consist of technological, administrative, and other tangible concepts. On the other
hand, it may be problematic to interrelate different levels of analysis—to integrate macro and
micro perspectives. Since the focus is on invariant features of behaviour, this approach does not
account for the fact that there are different subjective perspectives. Methodologically, this criticism is one of
the major contributions from the constructivists, i.e. the methods used to study these processes
must be sensitive to individual variation and account for subjective data.

However, in the constructivistic tradition socially constructed concepts and models are inferred
from discursive data. It is an inductive approach with distinct problems of unique, subjective
descriptions and interpretations, resulting in difficulties when generalising across cases. In an
explorative approach to study communication in complex tasks, it would be safer to anchor the
analysis in well-defined categories, that follow a preliminary task description, governed by
system concepts and organisation principles, than to rely on a constructivistic strategy.

There are several methodological implications for observation of task related communication.
For instance, what kind of research strategy should be employed when the focus is on complex
tasks within a real-life context, and when the question posed is ‘how’, and the goal is to draw
valid inferences about the strategies used in task execution? What strategy should be used when
the task is characterised by great complexity, and several aspects covary naturally, and when the
boundaries between the task and its context are not clearly evident, and it is not possible to gain
control over behavioural events and how they unfold over time? What research strategy should
be selected when the boundary conditions may not be constant during the sequence of events,
and when there are important serial and parallel processes proceeding simultaneously? With
reference to these questions and the previous elementary discussion of different approaches to
task analysis, an advanced case study approach (Campbell, 1984), accompanied by controls that
guard against the construct and external validity deficiencies connected to the practitioner-
based ergonomics approach, seems to be the correct strategy.

Opportunities to intentionally replicate case studies should be utilised, that is, the selection of
cases to study should be complementary and allow replication. In order to generalise findings
analytically, the psychological constructs needed to satisfy the validity issue must be added.
If these cannot be found in existing theory they can be anchored in the system analysis. By
making analytical rather than statistical generalisations, the replication logic that case studies
rely on is equivalent to the logic behind the generation of experimental studies (Yin, 1984).

Content analysis of communication data


Communication data are only a sample of evidence needed to infer task related knowledge
and strategies. The size of the sample and the degree of biased selection may vary across
actors and settings. When using quantitative indicators, this generates problems. Is, for
instance, the frequency of utterances monotonically related to the significance of discourse
content? An even more complicated issue is the relation between a verbal utterance and the
cognitive meaning of the sender and the receiver. On the one hand, the relation can be
difficult to determine regarding whether thought is mirrored through speech. In complex
settings of real life communication between parties, the validity concept becomes hard to
grasp. One sensible way to assess validity for communicative tasks is to use the problem
building phase (Simon, 1992) as a reference point, and the circumstantial validity evidence
becomes the degree of convergence between preparation and conversation.

On the other hand, the same utterance is very likely to vary in regard to the relative
importance that different actors attach to it within and across different situations. The
interlocutors communicate fragments of their partly common representations, what they
regard as important factors and how they perceive the relationship between these factors. The
information received is added to existing representations or used to alter priorities or relations
within these. Thus, performance in tasks where actors have to interact socially depends not only
on the actions of the sender or the responses from the receiver, but on their joint,
situated contribution to the issues discussed.

This yields problems in classifying data in psychologically meaningful categories, required to
describe and explain the task strategies of the actors. Utterances can be objectively recorded
and also classified with high inter-observer agreement as long as well-defined and simple
categories are being used. This is the case when the classification follows a preliminary task
description governed by system concepts and unequivocal organisation principles.
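
As an illustration of the kind of check implied here (the category labels and codings below are invented), agreement between two observers classifying the same utterances into simple, well-defined categories can be summarised with Cohen's kappa:

# Sketch: Cohen's kappa for two observers classifying the same utterances.
# Category names are illustrative only; assumes expected agreement < 1.
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

coder_a = ["technical", "administrative", "technical", "other", "technical"]
coder_b = ["technical", "administrative", "other", "other", "technical"]
print(round(cohens_kappa(coder_a, coder_b), 2))   # -> 0.69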

Conclusions
The research strategy suggested in this paper and applied in two empirical studies by
Strangert et al. (1996), and Wikman and Strangert (1996) has been case-studies in field
settings. Some rather severe theoretical and methodological problems in task analysis have
been outlined in this paper, and the suggested approach must be able to handle the
deficiencies outlined. This approach does not capture the subjective meaning of the
interlocutors. Still, the results from our empirical studies confirm our approach, with case-
study based task analysis as a promising initial step towards a general explanation of
communication in complex tasks.

References
Campbell, D.T. (1984) Foreword. In R.Yin. Case study research: Design and methods.
(Beverly Hills: Sage Publications)
Cook, T.D. & Campbell, D.T. (1979). Quasi-experimentation: Design & analysis issues for
field settings. (Boston: Houghton Mifflin Company)
Hendricks, H.W. (1995). Future directions in macroergonomics. Ergonomics, 38, 1617–1624
Katz, D., & Kahn, R.L (1978). The social psychology of organisations. (New York: John
Wiley)
Neisser, U. (1976). Cognition and reality: Principles and implications of Cognitive
Psychology. San Francisco: W.H.Freeman & Company.
Rasmussen, J. (1993). Analysis of tasks, activities and work in the field and in laboratories.
Le Travail humain, 56, 133–155
Simon, H.A. (1992). What is an explanation of behaviour? Psychological Science, 3, 150–161
Strangert, B., Wikman, J., & Strangert, C. (1996). Analysis of communication strategies in
Systems Inspections. (Umeå University, Department of Psychology)
Wickens, C.D. (1992). Engineering Psychology and Human Performance, 2nd. ed. (New
York: Harper Collins Publishers Inc.)
Wikman, J., & Strangert, B. (1996). Task uncertainty in systems assessment. (Umeå
University, Department of Psychology)
Yin, R. (1984). Case study research. Design and methods. (Beverly Hills: Sage
Publications)
INFORMATION
SYSTEMS
HEALTH AND SAFETY AS THE BASIS FOR SPECIFYING
INFORMATION SYSTEMS DESIGN REQUIREMENTS

Tom G Gough

Information Systems Research Group


School of Computer Studies
University of Leeds
Leeds, LS2 9JT

Health and safety at work has been a concern over many years, most
recently in respect of the users of computer-based information
systems. However, developers of such systems still seem to see
health and safety as someone else’s responsibility. Little attempt has
been made to incorporate consideration of health and safety into the
information systems design process. An examination of health and
safety in the workplace suggests that health and safety issues have
implications for all aspects of information systems development and
that guidelines on health and safety may provide an appropriate
basis for the initial specification of design requirements and lead to
the construction of systems which do not put the health and safety of
their users at risk.

Introduction
Despite the fact that health and safety at work has been a concern over many years,
most recently in relation to the use of computing equipment, information systems
developers still seem to see health and safety as someone else’s responsibility.
Although information systems developers would find it hard to disagree with the
contention that one aim of information systems development should be to produce
information systems which do not put the health and safety of their users at risk, it
is not obvious that much effort has been expended generally to incorporate
consideration of health and safety into the literature on the information systems
design process.

This paper begins with a brief review of the various aspects of health and safety in
the workplace to which information systems developers should be paying attention,
if the systems they build are to ‘fit safely’ into the context in which they are to be
used. It then briefly examines the assertion that insufficient attention is paid to
issues of health and safety by the proponents of various information systems
development methodologies.

The implementation of the European Union Directives in the UK will be used to
provide the design baseline for an exploration of whether guidelines on health and
safety could be used as the basis of an initial specification of information systems
design requirements. The paper concludes with a preliminary assessment of the
implications of such an approach for the theory and practice of information systems
development.

Health and Safety—Issues and Implications


Health and safety issues in the workplace may be grouped into four general
categories: those associated with the workplace itself; those identified as related to
fatigue; those that are stress related; and the risks associated with the use of
workstations that incorporate VDUs. Many of these issues are interrelated and need
to be addressed together, even if discussion is nominally about a single issue. Most
recent attention has been focussed on the problems that are perceived to be
associated with workstations and the use of VDUs. This is largely because the
growth in the use of VDUs and, in particular, keyboards linked to VDUs, has
resulted in an increase in the claimed incidence of repetitive strain injury (RSI).

An initial assessment of the health and safety issues outlined above suggests that
the implications for information systems design cover the whole of the process
from the analysis to installation of computer-based information systems in that the
issues identified cover all aspects of the design process, if they are to be effectively
addressed.

Information Systems Design Methodologies


If the contention that attention needs to be paid to health and safety throughout the
information systems design process is true, it would be reasonable to expect that
consideration of the health and safety issues would be a feature of information
systems development methodologies. It is obviously unlikely that any designer of a
computer-based information system would set out to build a system that put the
health and safety of the users of such a system at risk. However, as noted earlier,
little attempt is made to address health and safety in the literature on how to build
information systems. Any discussion tends to be fragmentary or even non-existent.
This is perhaps to be expected since the literature on both theory and practice leaves
the reader with the impression that health and safety is not the responsibility of the
information systems designer.

This criticism of insufficient attention to health and safety is valid across the
range of information systems methodologies. A review of a sample set of
methodologies to support this assertion will be found in Gough (1991). A
subsequent review in Gough (1995) showed little improvement over that in Gough
(1991) despite the intervening implementation of the EU Directives on health and
safety requirements for work with display screen equipment.

There is little sign of improvement despite the fact that there has been legislation in
place to implement the Directives and that there has been a considerable increase in
the number of reported cases of ‘RSI’. For example, not one of some 40 papers at a
recent conference on information systems methodologies (Jayaratna and
Fitzgerald, 1996) addressing “…issues ranging across the spectrum from social to
technical concerns” identifies health and safety as a key issue. Smith (1997)
contains no more than a brief reference to health and safety despite its focus on “the
major user issues in information systems development”. In a text aimed at providing
a foundation course in information systems (Avison and Shah, 1997), there appears
to be no reference to health and safety. Robson (1997) appears to simply repeat the
limited discussion in the first edition (Robson, 1994) suggesting that the author sees
no need to encourage information systems developers to regard health and safety as
integral to the information systems design process.

These three brief reviews with their illustrative examples are offered to support the
claim that the information systems development community in general sees health
and safety as not the responsibility of the information systems designer in terms of
the process of design.

The Setting and Application of Minimum Standards


The European Union Directives (agreed on 29 May 1990), incorporated into Article
118A of the Treaty of Rome, were aimed at improving the health and safety of
workers in the workplace and at ensuring that all Member States have
comprehensive and comparable health and safety legislation. In the UK the
Directives were implemented via new regulations and codes of practice issued
under the Health and Safety at Work etc Act of 1974 by the Health and Safety
Commission (1992) supported by Guidance on the Regulations from the Health and
Safety Executive of which that on Display Screen Equipment (Health and Safety
Executive, 1992) has particular relevance to the information systems design
process.

The ‘Guidance’ (Health and Safety Executive, 1992) covers all the nine regulations
addressed to display screen equipment as well as the Schedule (which draws on the
minimum requirements set out in the relevant European Union Directive) and an
Annex containing detailed guidance on workstation minimum requirements.
This detailed guidance is concerned with both the physical characteristics of
the workplace and the workstation, and with the systems to be used by the
workstation user.

Increasing emphasis is rightly being placed on the importance of all the people
engaged in the information systems development process and it is now widely
recognised that ‘people factors’ have a greater influence on the success or failure of
computer-based information systems than technical factors. However, this
recognition of the centrality of the potential user of information systems has not yet
been reflected in the advice provided by the proponents of the range of information
systems development methodologies. These are still focussed on the data, the
processes or the task, the latter in terms of task content rather than the implications
of the ‘doing’ of the task for the person engaged in it. The continuation of these
approaches, even if supported by increased user involvement (as widely
advocated), seems unlikely to reduce the incidence of the problems associated with
the use of VDUs. Similarly, more attention is being paid to the importance of the
organisational context in the effective operation of computer-based information
systems. Little attention is paid to the physical context in which such systems
operate, apart from the occasional discussion of the physical components of the
workstation. Continuing to ignore the other aspects of the environment in which
users work is likely to lead to the users continuing to experience VDU-related
problems since many health and safety issues are interrelated and all of them ought
to be addressed within the information systems design process.

Using the advice offered by the Health and Safety Executive (1992) as a template
for the initial specification of requirements and a basis for the implementation and
installation of the resulting information system would offer the opportunity to build
‘better’ systems. If the advice on the health and safety issues were followed within
the specification process, the resulting system built to meet such a specification
could reasonably be expected not to put the health and safety of its users at risk. The
arguments in favour of the efficacy of such an approach are the same as those for
ensuring that audit and security provisions are integral to information system
specification rather than less effective (and more expensive?) later post-
implementation additions. Insisting that the physical context is given equal weight
with the organisational context would require designers to be aware of the physical
implications of their design for their clients in the workplace, not merely as a visual
representation on a screen. Designers would need to re-acquire the broader
understanding of the systems environment which was lost with the adoption of a
narrow technical systems analysis approach in the late sixties and has never been
fully recovered. Designers would then be in a good position to assist their clients
with advice on making the physical environment one which did not put the health
and safety of its occupants at avoidable risk.
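
A hedged Python sketch of what using the guidance as a template might look like in practice is given below; the checklist items paraphrase common display screen equipment topics and are illustrative only, not a reproduction of the HSE Annex, and the identifiers are invented:

# Illustrative only: seed an initial requirements specification from a health
# and safety checklist. Item wording is paraphrased, not quoted from the HSE
# guidance, and the identifiers are invented.
CHECKLIST = {
    "HS-01": "Display image is stable, with adjustable brightness and contrast",
    "HS-02": "Software provides feedback and is paced to suit the user",
    "HS-03": "Workstation allows a comfortable posture and adequate legroom",
    "HS-04": "Task design permits breaks or changes of activity",
}

def seed_requirements(checklist):
    """Turn each checklist item into a traceable draft requirement."""
    return [{"req_id": f"REQ-{item_id}", "text": text, "source": item_id}
            for item_id, text in checklist.items()]

for req in seed_requirements(CHECKLIST):
    print(req["req_id"], "-", req["text"])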

Conclusion
Health and safety issues still appear to the majority of information systems
designers to be someone else’s responsibility and the long-standing concern about
health and safety in the workplace, underlined by the recent legislation in relation
to the risks associated with the use of computer-based information systems, does
not seem to be reflected in the theory and practice of information systems. If,
however, information systems designers were to use the advice available in support
of the recent legislation as the starting point for requirements specification, as input
to the implementation of such specifications, and as part of the planning for the
installation of the resulting systems, such systems would contribute to the
production of healthier and safer workplaces and reduce the incidence of the
problems associated with VDU usage.

References
Avison, D.E. and Shah, H.U. 1997, The Information Systems Development Life
Cycle; A First Course in Information Systems, (McGraw-Hill, London).

Gough, T.G. 1991, Health and Safety Legislation—Threat or Opportunity. In M.
C.Jackson, G.J.Mansell, R.L.Flood, R.B.Blackham and S.Y.E.Probert (eds.)
Systems Thinking in Europe, (Plenum Press, New York), 171–175.

Gough, T.G. 1995, Health and Safety Legislation Implications for Job Design. In
S.A.Robertson (ed.) Contemporary Ergonomics 1995, (Taylor and Francis,
London), 446–450.

Health and Safety Commission 1992, Approved Code of Practice—Workplace
(Health, Safety and Welfare) Regulations 1992, (HMSO, London).

Jayaratna, N. and Fitzgerald, B. (eds.) 1996, Lessons Learned from the Use of
Methodologies, (British Computer Society and University College Cork,
Cork).

Robson, W. 1994, Strategic Management and Information Systems—An Integrated
Approach, (Pitman Publishing, London).

Robson, W. 1997, Strategic Management and Information Systems—An Integrated
Approach Second Edition, (Pitman Publishing, London).

Smith, A 1997, Human-Computer Factors: A Study of Users and Information
Systems, (McGraw-Hill, London).
COGNITIVE ALGORITHMS

Ronald Huston, Richard Shell, and Ashraf Genaidy

Department of Mechanical, Industrial and Nuclear Engineering


University of Cincinnati, Cincinnati, OH 45221–0116, USA

This paper presents a new procedure, using cognitive algorithms, for
studying complex human-based systems. The procedure uses fuzzy
logic, linguistic variables, and participant (worker) expertise to
establish the fundamental characteristics of the systems. These
characteristics then provide for the identification of system norms
which in turn lead to standards and a basis for system modification
and optimization.

Introduction
The traditional methods of characterizing, evaluating, and regulating work systems
have become increasingly ineffective. They simply cannot cope with the
complexity brought on by the technological advances and the higher productivity
demands. There is a need for new intelligent systems which can effectively relate to
these increased complexities.
The objective of this paper is to provide a basis for the development of such
systems. This is accomplished through a discussion of the following concepts:
system complexity; cognitive algorithms; and experiments.

System Complexity

Definition
The concept of complex systems has been succinctly stated by Lewin (1993) as: “the
phenomena that out of the interaction of individual components at a local level
emerges a global property which feeds back to influence the behavior of the
individual components”. For example, the interaction of species within an
ecosystem might confer a degree of stability within it. Stability in this context is an
emergent property. As another example, in economics, the aggregate interaction of
manufacturers, distributors, marketers, consumers, and financiers forms a modern
capitalistic system. A more abstract example is the relation between knowledge and
wisdom. Here knowledge is analogous to the “local interaction” and wisdom to the
“emergent global property”. In this context, the interaction of different entities of
knowledge gives rise to wisdom which utilizes the knowledge base. Such utilization
in turn expands the knowledge base thus producing dynamic interaction
contributing to system complexity.

Domains of Complexity
Weaver (1948) characterized complexity in terms of three regions: 1) organized
simplicity; 2) disorganized complexity; and 3) organized complexity. “Organized
simplicity” describes systems with only a few variables (or parameters)—usually
only one, two, or three—and a high degree of determinism. Such systems
dominated analyses in physical sciences prior to 1900. An example is Newtonian
mechanics of one or two particles.
At the other extreme “disorganized complexity” describes systems with very
large numbers of variables, regarded as random variables. To study such systems
analysts have formulated various probabilistic and statistical theories.
According to Weaver, however, there exists a great middle region between these
extremes which he categorized as “organized complexity”. In this region there is a
sizable number of variables which are integrated into an organized whole. To study
these systems analysts have developed some relatively new techniques including
operations research, linear programming, integer programming, fuzzy logic, neural
networks, bond graphs, chaos theories, and genetic algorithms.
Our contention is that organized complexity can be expanded to include the two
extremes. Specifically, organized simplicity is a special case of organized
complexity and it is questionable that “randomness” even exists in the physical
world. In 1992 Kosko strongly questioned the presence of randomness in the real
world, indicating that uncertainty aspects of complexity are deterministic in nature.
Alternatively, Prigogine (1985) suggests that reality lies somewhere between
determinism and randomness.

Use of Fuzzy Logic and Linguistic Variables


One method of measuring complexity is to employ the principles of fuzzy logic and
fuzzy sets developed by Zadeh (1965) and others. Zadeh envisioned the application
of fuzzy sets to human-based systems such as economic and ergonomic systems
(as in the research here). Fuzzy logic is based upon a premise of “incompatibility”
which states that “as the complexity of a system increases, our ability to make
precise and yet significant statements about its behavior diminishes until a
threshold is reached beyond which precision and significance (or relevance)
become almost mutually exclusive” (Zadeh, 1973).
Fuzzy logic employs linguistic variables to describe the contents of a fuzzy set.
According to Zadeh (1975) a linguistic variable is an entity whose values are not
numerical but instead qualitative words or phrases in a natural language. The use
of words or phrases has the advantage of being less specific than numerical
variables and thus being easier to apply when describing complex systems.
Linguistic characterization of system complexity is especially useful with
human based systems since people tend to think and act in such descriptive terms as
“high and low”, “hot and cold”, “far and near”, “fast and slow”, “heavy and light”,
etc. In an application of these concepts Tichauer (1978) transformed numerical
lifting values into ranges defined by linguistic variables. This procedure arguably
led to a more meaningful interpretation of results.
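
As a minimal Python sketch of this idea (the break points below are invented and are not Tichauer's ranges), a numerical lifting load could be mapped onto the linguistic values "light", "medium" and "heavy" with simple triangular membership functions:

# Sketch of a linguistic variable: triangular membership functions mapping a
# numerical load (kg) onto "light", "medium" and "heavy". Break points are
# illustrative assumptions only.
def triangular(x, a, b, c):
    """Membership rising from a to a peak at b, then falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def load_memberships(kg):
    return {
        "light": triangular(kg, -1, 0, 10),
        "medium": triangular(kg, 5, 12.5, 20),
        "heavy": triangular(kg, 15, 25, 40),
    }

print(load_memberships(8))   # partly "light", mostly "medium"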

Cognitive Algorithms
A cognitive algorithm is a mental procedure for solving a problem. It is a product of
human thinking and reasoning about a problem. Humans think in “analog” instead
of “digital” terms, using “continuous” instead of “discrete” variables. This means
that information is processed in “chunks” or “segments” as opposed to specific
data. The edges or boundaries of these chunks of information are often ill defined,
but this allows the application of fuzzy set theory as exposited by Zadeh (1973).
Our thesis is that cognitive algorithms can be used in solving problems with
complex systems. The best resource for establishing these algorithms is people
with “expertise” in the system being studied. We discuss expertise and the
derivation of the algorithms in the following two subsections.

Expertise
Perhaps the major distinction between experts and non-experts (or novices) in a
given field is that experts confront problems with “skill-based” behavior whereas
novices confront problems with “knowledge-based” behavior. That is, experts will
solve routine problems almost subconsciously, or automatically, using skills
acquired by repeated solving of the same type of problems. Alternatively, novices
approach the same problems consciously, relying on whatever knowledge they may
have about the problem and its solution.
When asked to describe this skills-based approach to problems experts will
typically respond by saying, “I don’t know. I just do it.” This does not mean that
experts approach problems capriciously. It simply means that the expert has solved
the problem at hand so many times, that the knowledge required is stored in
patterns in long-term memory, which is retrievable without conscious mental effort.

Development of Cognitive Algorithms


The “input” and “output” variables of cognitive algorithms will be linguistic
variables. These variables must be capable of ranging between the extremes of all
possible variables and states encountered in the workplace (for example “light” to
“heavy”). The algorithms will convert the input or independent variables into the
desired output or dependent variable. Within this context, a cognitive algorithm is
defined as an aggregate of linguistic operations which establishes a rational
relationship between the input and output variables.
For example, the dependent variable might correspond to the safety behaviors
essential to minimizing the risk of adverse health effects during performance of
various tasks. Experts would then enable the algorithm development through their
experimental assessment of the linguistic variables.

It should be noted that, during the experimental assessment, information regarding input variables should not be randomized when presented to the experts.
Instead the information should be presented in an organized fashion as in ascending
or descending order. Expressed more succinctly: “Don’t randomize, organize.”

Experimental Validation
An experiment was conducted on thirty male workers engaged in infrequent lifting.
Twenty-nine of the workers had at least five years of experience (one had forty years); the remaining worker had only six months of experience.
Each worker was asked, based on his knowledge and experience, to assess the
effects of load and horizontal distance on lifting effort for three heights of lift. The input variables were thus the load and the horizontal distance. Each of these variables was assigned three values. The load values were “light”, “medium”, and
“heavy”. The distance values were “close”, “medium”, and “far”. The output
variable was assigned one of nine linguistic values.
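
Written out explicitly, a cognitive algorithm of this form amounts to a linguistic rule table mapping each (load, distance) pair onto one of the nine output values. The sketch below illustrates the structure only; the effort labels and the particular assignments are hypothetical and would, in practice, be derived from the workers' assessments.

    # Minimal sketch of a cognitive algorithm as a linguistic rule table.
    # The nine effort labels and the rule assignments are hypothetical; in the
    # study they would be elicited from the experienced workers.

    RULES = {  # (load, horizontal distance) -> perceived lifting effort
        ("light",  "close"):  "extremely low",
        ("light",  "medium"): "very low",
        ("light",  "far"):    "low",
        ("medium", "close"):  "moderately low",
        ("medium", "medium"): "moderate",
        ("medium", "far"):    "moderately high",
        ("heavy",  "close"):  "high",
        ("heavy",  "medium"): "very high",
        ("heavy",  "far"):    "extremely high",
    }

    def lifting_effort(load, distance):
        """Convert the two linguistic input variables into the output variable."""
        return RULES[(load, distance)]

    print(lifting_effort("heavy", "close"))   # -> "high"
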
The results showed that lifting effort increased with an increase in either load or
horizontal distance. These findings are consistent with those reported in the
ergonomics literature. The results of this preliminary study indicated that the load
effect on lifting effort is more pronounced than that of horizontal distance, although
in a few cases, workers recognized both input variables as equally important.

Conclusions
To summarize, the concept of cognitive algorithms with the use of linguistic
variables provides a means of studying a wide variety of complex human-based
systems. The lifting experiments validate the approach. But, more needs to be done
before the procedure is firmly established. Specifically, a system of linguistic
mathematics needs to be developed and implemented to produce more precise
relationships between the input and output variables, incorporating the nonlinear
effects occurring at extreme values of the input variables.

References
Kosko, B., 1992, Neural Networks and Fuzzy Systems—A Dynamical Systems
Approach to Machine Intelligence, (Prentice-Hall, Englewood Cliffs, New
Jersey).
Lewin, R., 1993, Complexity: Life at the Edge of Chaos, (Macmillan Publishing
Company, New York).
Prigogine, I., 1985, New perspectives on complexity. In The Science and Praxis of
Complexity, (The United Nations University, Tokyo), 107–118.
Tichauer, E., 1978, The Biomechanical Basis of Ergonomics: Anatomy Applied to
the Design of Work Situations, (Wiley-Interscience Publication, New York).
Weaver, W., 1948, Science and complexity, American Scientist, 36, 536–544.
Zadeh, L.A., 1965, Fuzzy sets, Information and Control, 8, 338–353.
Zadeh, L.A., 1973, Outline of a new approach to the analysis of complex systems
and decision processes, IEEE Transactions on Systems, Man and
Cybernetics, SMC-3, 28–44.
Zadeh, L.A., 1975, The concept of a linguistic variable and its application to
approximate reasoning-I, Information Sciences, 8, 199–249.
DESIGN METHODS
RAPID PROTOTYPING IN FOAM OF 3D ANTHROPOMETRIC
COMPUTER MODELS IN FUNCTIONAL POSTURES

Siel Peijs, Johan J.Broek and Pyter N.Hoekstra

Delft University of Technology


Faculty of Design, Engineering and Production
Subfaculty of Industrial Design Engineering
Jaffalaan 9, 2628 BX Delft, the Netherlands

This paper describes the first-stage results of a feasibility study of rapid
prototyping in foam of 3D anthropometric computer models in functional
postures. The following three phases will be discussed: first the creation of a
list of surface points that simulate the anthropometric model’s outside
geometry; next the conversion of these points into B-spline descriptions of
the model’s constituent body-members and thirdly the input of these
descriptions into the available CAD/CAM software to realize the foam
milled model. We will briefly discuss possible next steps that could lead
from the automatic creation of anthropometric computer models using 3D
anthropometric whole body scans of subjects in standardized postures to the
rapid prototyping of these models in functional postures.

Introduction
The last ten years have seen the birth and successive growth and flowering of a new scientific
discipline and accompanying technology: rapid prototyping. Rapid prototyping can briefly be
described as the fabrication of a physical model directly from a 3D CAD design. Where the
production of a prototype with classical techniques might readily take a few weeks this now
can be shortened to days or even to hours: hence the term ‘rapid’ prototyping. At the moment
two main techniques can be discerned. First we have LMT: Layer Manufacturing Technique
(additive fabrication as e.g. Stereolithography) where the physical model is built by adding
material layer by layer. The second technique (subtractive fabrication such as milling, cutting
etc.) is more closely associated with classical techniques in that it removes material step by
step. This last technique is used at the Subfaculty of Industrial Design Engineering in the
SRP (Sculpturing Robot Project), see e.g. Vergeest and Tangelder (1996), where a CAD/
CAM software controlled six degree-of-freedom foam milling robot is applied for the rapid
prototyping of free form surfaces.
Since ADAPS (Anthropometric Design Assessment Program System) is also available to
us to visualize and manipulate 3D anthropometric models on a computer screen into
workspace related functional postures, a feasibility study of the linkage of both projects was
carried out. Results—a foam milled anthropometric model (or selected body parts)—might be useful for presentation purposes and for highlighting Man Product Interaction.

Conversion
The following is quite straightforward. What we have is one of the 3D computer models of
ADAPS, visualized on a display in a workspace related functional posture. What we would like to end with is an anthropometric model in foam. In between we have a number of conversion
steps that transfer information about the anthropometric model’s outside geometry, via the
definition of surfaces of enclosed body parts into data needed to calculate the milling paths
for the robot tool.

The ADAPS Model


An ADAPS-model consists of a set of linear branched chains, containing twenty-five links or
body members. Relative to these links we define surface-points which, together with a
number of lines between these points, determine the outside geometry of the model (see
Figure 1). A link’s orientation is defined by its joint-angles relative to the connecting, more
proximal link. Changing the orientation of a link will change the absolute position of the
surface-points in space (while keeping constant their relative position to the link) and in this
way the model’s outside shape. We should note here that the lines used to display the
anthropometric model only simulate a 3D entity but do not, as such, automatically form well
defined boundaries of enclosed volumes (see e.g. the regions of the model’s elbows in
Figure 2).

Figure 1. Schematic representation of an ADAPS-model via links, surface points and connecting lines
Figure 2. Lines do not enclose volumes in the regions of e.g. the model’s elbows

We will come back to this in the next paragraph. All we have to do now in transferring
available surface geometry information of an ADAPS-model to the next phase is the
production of a list of co-ordinates of selected surface-points (and implicitly the knowledge
of what lines should connect them).

Surfaces
The next phase is the construction of surfaces from these ADAPS coordinates with the objective of recreating (as closely as possible) the original ADAPS-model’s geometry. The
definition of 3D surfaces is handled using B-splines. Parameterized B-spline (NURBS)
surfaces are defined by the input of the X-, Y- and Z-coordinates of ‘control’ points in two
directions. These points form a 3D rectangular web of rows and columns. The control points for the surface part HEAD, for example, are thus selected by the ADAPS coordinates of 5 times 6 points (see Figure 3).

Figure 3. Definition of control points in two directions for surface part HEAD

The conversion that now is needed consists of a simple transfer-table linking the point
numbering in the ADAPS-description with the numbering for the control points. In this way
various surfaces (TORSO, L-ARM, R-ARM, L-LEG, R-LEG etc.) can be defined that
represent the constituent body-members of an ADAPS model.
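
The transfer table itself can be thought of as a simple lookup structure that selects ADAPS surface-point coordinates into the rows and columns of a control grid. The sketch below illustrates the idea for the 5 times 6 HEAD grid; the point numbers are hypothetical and do not reproduce the real ADAPS numbering.

    # Minimal sketch of the transfer-table idea.  adaps_points maps an ADAPS
    # point number to its (x, y, z) coordinate; the table lists, per control-grid
    # row, which ADAPS points act as control points.  All numbers are hypothetical.

    def build_control_grid(adaps_points, transfer_table):
        """Return the control points as rows of (x, y, z) tuples."""
        return [[adaps_points[n] for n in row] for row in transfer_table]

    # Hypothetical 5 x 6 transfer table for the surface part HEAD.
    HEAD_TABLE = [
        [101, 102, 103, 104, 105, 106],
        [111, 112, 113, 114, 115, 116],
        [121, 122, 123, 124, 125, 126],
        [131, 132, 133, 134, 135, 136],
        [141, 142, 143, 144, 145, 146],
    ]

    # adaps_points would come from the surface-geometry extraction step, e.g.
    # adaps_points = {101: (0.00, 0.12, 1.71), 102: (0.03, 0.12, 1.72), ...}
    # head_grid = build_control_grid(adaps_points, HEAD_TABLE)
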
We mentioned earlier that the lines used in visualizing an ADAPS-model do not always
form the boundaries of an enclosed volume. Care had to be taken in the definition of the surfaces to ensure that all adjacent surfaces form a closed object (so that it can be milled). Extra surfaces had to be defined in the pelvic region to close a possible gap between the torso and the legs. Since the number of surfaces is not a limiting factor, the choice was made to model the nose and the breasts separately, making the modeling of the head and torso easier.

SIPSURF
The next phase links the defined surfaces and prepares the data needed for the milling process.
This we can accomplish by using the SIPSURF software that was developed in the Sculpturing
Robot Project SRP. SIPSURF—short for Simple Interactive Program for SURFaces—allows
the definition of the various surfaces and their relation to each other. It enables visualizing the
results as a 3D web, as a surface or as a rendering. Geometric errors can be detected and readily
repaired. When everything is satisfactory a module is started that automatically calculates the
trajectories for the milling robot. The milling can now be left to the hardware: a combination of
a turntable and a six degree-of-freedom foam milling robot. In this way we could start with an
ADAPS model as displayed on a computer screen (Figure 4) and end with the realized object in
foam as depicted in Figure 5. Foam models of 30 cm height were realized in two milling
resolutions. Corresponding milling times are given in Table 1.

Figure 4. ADAPS model as displayed on the screen
Figure 5. Realized model in foam with milling resolution of 0.5 mm

Table 1. Milling resolutions and milling times

Discussion
This feasibility study has shown that the linkage between an ADAPS-model in a functional
posture displayed on a computer screen and a realized object in foam is certainly possible.
Some remarks however have to be made.

- Exerted milling forces in combination with the stiffness of the foam material used
(including the scale of the foam model and the mill resolution) resulted in the need for
an extra supporting surface under the foam model’s left foot (as may be seen in Figure
5), otherwise the foot could break.
- ‘Filling the gaps’ as e.g. between the torso and the legs is at the moment an ad-hoc
procedure depending on the specific functional posture of the ADAPS model. Special
closing techniques like blending and filleting however are available and can be
incorporated in a later stage.
- At the moment there is a tiny gap between the surface model’s breasts and the torso. This
is not noticeable in the foam model (the mill diameter is much larger than the gap) but it
will have to be remedied when producing foam models of bigger size.

We end this paragraph with the following: there is ongoing research in the area of 3D Surface
Anthropometry using Whole Body Surface Scans of subjects in standardized postures—see
e.g. HQL (1995), Daanen (1995), CARD (1996) or Robinette and Daanen (1996). As stated
elsewhere (Hoekstra, 1997) ergonomists are especially interested in predictions of real life
situations. Much research is still needed to derive arbitrary, but functional, postures from scan
data of only a few standardized postures. The possibility to realize true-to-life human models
(or body-parts) in foam—to be incorporated within workspace mock-ups or product
prototypes—might help in highlighting human workspace/product interaction issues.

Conclusions
In this study we have found that rapid prototyping in foam of 3D anthropometric computer
models in functional postures is certainly feasible. Via successive steps of surface geometry
extraction, the definition of parameterized (NURBS) B-spline surfaces and the automatic
calculation of the tool-path trajectories for a foam milling robot, models in two milling
resolutions were realized. Although much work is still needed in fully automating the entire
process, this ongoing project seems very promising in highlighting human workspace
interactions to the designer or ergonomist.

Acknowledgments
We would like to express our gratitude to Bram de Smit for his valuable help with the
production of the foam milled models and to Henk Lok for help with the illustrations.

References
CARD, 1996, Computerized Anthropometric Research and Design Lab, Fitts Human
Engineering Division of Armstrong Laboratories at Wright-Patterson Air Force Base,
Ohio, USA. http://www.al.wpafb.af.mil/~cardlab/
Daanen, H.A.M. 1995, 3D-Oppervlakte Antropometrie, Ned. Mil. Geneesk. Tijdschrift 48,
171–178, e-mail: Daanen@tm.tno.nl
Hoekstra, P.N. 1997, On Postures, Percentiles and 3D Surface Anthropometry, In
S.A.Robertson (ed.) Contemporary Ergonomics 1997, (Taylor & Francis, London),
130–135
HQL, 1995, Research Institute of Human Engineering for Quality Life, Osaka, Japan. http://
www.hql.or.jp/eng/index.html
Robinette, K.M. and Daanen, H.A.M. 1996, CAESAR Proposed Business Plan, Armstrong
Lab. (AFMC), Wright-Patterson AFB, Ohio, USA, personal communication
Vergeest, J.S.M. and Tangelder, J.W.H 1996, Robot machines Rapid Prototype, Industrial
Robot. 23–5, 17–20
The use of high and low level prototyping methods for
product user interfaces

John V H Bonner* and Paul Van Schaik+

Teesside University
Middlesbrough
TS1 3BA

*Institute of Design   +School of Social Sciences

As part of a research project investigating the development of novel user
interfaces for consumer products, two different types of design and
evaluation methods were appraised. These two methods were high and low
level software based prototyping of novel user interfaces for two types of
consumer product. In the first study, we used high level, interactive prototypes with a high degree of functionality and visual fidelity. With these
prototypes, we conducted a structured series of formal evaluation methods
using thirty subjects. The second study was less structured with a mixture of
low level, local prototypes representing object and behavioural elements of a
product interface. These were tested using much smaller groups of subjects.
This paper compares and contrasts these methods, presents some specific advantages and disadvantages of each, and outlines problems associated with both approaches.

Introduction
The use of interface prototypes to test interactivity is advocated as an integral and vital part of
the development process (Gould and Lewis, 1985; Hix and Hartson, 1993; Newman and
Lamming, 1995). Prototypes can be developed to different degrees of fidelity. Typically they are either non-interactive and generally paper based, or interactive, representing different levels of fidelity; interactive prototypes are defined here as either low level (where a local or specific element of the interface design is produced) or high fidelity, high level prototypes (where all or most of the functionality, and often the form, of the interface is fully represented).
The use of non-interactive prototypes, which can also be defined as scenario tools (van
Harmelen, 1989; Carroll and Rosson, 1992) is not discussed here, although they do have a
significant role, particularly in the early conceptual stages of an interface design process.
This paper specifically examines the role of high and low level interactive prototypes and
explores the advantages and disadvantages of both approaches along with observations made
during the development of a range of novel interfaces for a washing machine and a
microwave oven.
This research forms part of an EPSRC funded project to develop guidelines and design
tools for the development of novel interfaces for consumer products. The exploration of the
development of novel interfaces is important as consumer products become more
complicated through technological convergence with computers and telecommunications.
The scope of the project was initially limited to the development of design guidelines but our
research has identified the need for design tools far earlier in the development process where
guidelines have not proved to provide effective support (Henninger et al, 1995). One important
design tool that we have therefore examined is the use of software-based prototyping
development and evaluation. Two case studies are discussed below.

High level prototypes


In the first study, three types of novel interfaces were proposed. These were based on a range
of criteria including user requirements information on conventional consumer products,
gained from focus group sessions conducted by one of the collaborating partners, along with
technical developments that were anticipated by both of the industrial collaborating partners.
The novel interfaces were:

• Animated Object Display This dialogue style used graphic representations of three washing
programming parameters (spin speed, fabric type and wash temperature) as an animated
object which could be altered using conventional toggle switches to change the colour,
shape and speed of the animated object. This type of interaction dialogue has previously
been used to represent aircraft flight information (Wickens and Andre 1988).
• Drag and Drop This dialogue style was selected to examine the use of finger
controlled ‘direct manipulation’ (dragging and dropping icons on a display panel to
‘design or build’ a washing programme).
• Auditory display Auditory information plays an important role in many mechanically
based products but is under-utilised in many computer based products. Therefore, we were
keen to see if the addition of auditory displays for functions such as washing cycle status
and control feedback would increase the user’s understanding of product functionality.

The purpose of the study was two-fold: firstly to explore the effectiveness of a range of
design and evaluation methods; and secondly to derive design guidelines related to the
different interaction styles. It is the former objective that is of concern here. One of the design
methods under review was the effectiveness of prototyping as a design tool to develop novel
interfaces whilst also identifying usability problems.
The level of novelty of the proposed interfaces posed an interesting dilemma. In order to
obtain meaningful feedback from any user trials the novel interaction styles needed to be set
in context. Allowing users to explore novel interaction styles without contextual information
would not identify the correct type of usability problems. Furthermore, the type of evaluation
methods we wanted to validate required that the user developed a comprehensive
understanding of the different novel interfaces. The prototypes, therefore, needed, to some
extent, to be representative and recognisable as a washing machine control panel or interface.
This suggested that a high level prototype would be most appropriate.
The prototypes were developed using Macromedia Director (v 4.0), a multimedia
authoring software package. An experienced prototype developer/designer was used and
initial concepts were developed using pen and paper during discussions with one of the
collaborating partners and then translated into an interactive prototype. The prototype
developer made notes during the development process and recorded any problems or design
decisions needing to be made. In light of this process, the following observations were made.

Advantages
Using a high level prototype proved to be excellent for addressing usability problems from a top-down perspective. The level of novelty within the different interaction styles prompted a
much larger number of design issues than anticipated, which needed to be resolved. However,
we found very little published interface design guidance which could assist or support in the
decision making process. We found little evidence during the user trials to suggest that users did
not accept the embodiment of the prototype. This was established by using a method known as
‘Teach-Back’ (van der Veer, 1990). Feedback from the users suggested that the prototype itself
did not obstruct their acceptance of the concepts presented to them. As the functionality of the
interfaces was largely complete, we could perform keystroke data-logging. This allowed us to
set a wide variety of tasks and monitor performance measures, allowing quite subtle differences
in behavioural activity between the different interfaces to be measured.
Finally, the high level prototypes created credibility for the development work; whilst this
does not have any direct impact on the development process, a good visualisation of the
design problem instilled confidence in the design proposals.

Disadvantages
One of the primary objectives of the high level prototypes was to arrive at a working
interpretation of the dialogue styles, in context, as quickly as possible, so that the underlying
design concepts could be assessed. This caused the most significant problem during the
development process. As usability problems arose, design decisions had to be made to overcome
them. The literature proved inadequate in many situations in providing an indication of where
a solution may be sought. Most design problems, therefore, could only be resolved by
conducting many low level prototype studies. However, the difficulty of non-contextual
interpretation of the design proposals would then become a problem. An example of this
concerned the behavioural properties of icons on the ‘Drag and Drop’ interface. These icons
could be picked up by the finger and moved around the touch screen. We had little knowledge
of how these should be highlighted, what type of ‘drag’ qualities they should possess, and
how they should indicate that they had arrived at the target area. To overcome this, basic
heuristic rules (Nielsen, 1993) were used to assess the behavioural characteristics, although
in some instances even these were inadequate.
Problems identified further down the development process were often more difficult to resolve,
purely because a good deal of re-coding would have been required. Some of these problems
only manifested themselves when a series of elements of the interfaces became inter-related.
The key rationale for developing a high level prototype was to ensure that the interaction
dialogues were evaluated in context. However, we found that this advantage was also counter-
productive. During the trials subjects were asked to evaluate the novel interfaces against a
‘standard’ washing machine interface (which was also presented in prototype form). We
found that users tended to make direct comparisons with their own appliance at home rather
than using the ‘standard’ interface as the benchmark. This shed doubt on the validity of some
of the findings from the subjective comparative measures that were undertaken.

Low level prototypes


Whilst the first study provided a top-down approach to the development of novel interfaces,
we decided to use a bottom-up approach in the second study to allow for comparisons
between the two. In the second study, one of the novel interaction styles (Drag and Drop) was
developed further but using a different application domain. For this study, a novel microwave
control panel was developed. The approach was to develop alternative design proposals of the
interface using low level prototypes and provide exposure of the conceptual design elements
to users, creating more immediate feedback.
An existing microwave oven (upon which the novel interface was based) was assessed in
terms of its functionality and usability. In discussion with one of the collaborating partners, a
new interface was designed using a series of paper prototypes. At this stage a series of
interaction design issues emerged and it was agreed that these would be resolved using a set
of low level prototypes. These were: alternative methods for selecting icons; presentation
styles for animated icons; dragging behaviour for moving icons across the touch panel;
making multiple icon selections; and iconic representation.
User trials were conducted using very small samples (usually 5–6 subjects) and
alternatives were presented in pairs where the user had to select one of them against three
assessment criteria. As the subject groups were so small a paired comparison test could not be
conducted, so design solutions were based purely on frequency counts. An amalgam of these
local prototypes was then used on another local prototype which offered more functionality.
This prototype is shortly to be evaluated again using small user groups before being
developed into a high level prototype.
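
Tallying such paired-comparison choices is straightforward; the sketch below is purely illustrative, with invented choices and criterion names, and is not the project's actual analysis.

    # Minimal sketch: tallying paired-comparison choices per assessment criterion.
    # The recorded choices and criterion names are invented for illustration.

    from collections import Counter

    # Each record: (assessment criterion, design alternative chosen by the subject).
    choices = [
        ("ease of selection", "A"), ("ease of selection", "A"),
        ("ease of selection", "B"), ("clarity", "B"),
        ("clarity", "B"), ("clarity", "A"),
    ]

    counts = Counter(choices)
    for (criterion, alternative), n in sorted(counts.items()):
        print(f"{criterion}: alternative {alternative} chosen {n} times")
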

Advantages
This approach meant that there was less investment in key concepts or design proposals
allowing for more diverse ideas to be considered. The ‘fluidity’ of the design concept could
remain for a longer period of time. There was also a higher degree of impartiality about the
design process where a design problem could be decided through a user trial rather than
attempting to use heuristic evaluation methods. The decision making process was generally
quick once a format for the development of the prototypes, and the setting up of the user
trials, had been established. This approach did prompt other design solutions to emerge and in
one case, it was possible to merge two favoured design solutions together as they were
mutually compatible. However, we do not know whether there is any confounding
interference by adopting this approach.

Disadvantages
By identifying a series of design problems and resolving them through low level prototypes,
there appeared to be a lack of coherence in the design process. This became more apparent
when the relationships between the interface elements were combined. For example, we
found that the preferred icon for ‘auto-defrost’ was presented in a static format. Introducing
this to the next level of prototyping where some animation was then being considered, caused
problems as we felt that a different, but less preferred example, would be more appropriate
within this broader context.
As anticipated in the previous study we did find a lack of contextual relevance which
prevented users making informed decisions about the design proposals presented using the
low level prototypes. One user even abandoned the trials as she did not understand what was
expected of her or what the purpose of the trial was. In addition, many subjects hinted that
they were making arbitrary decisions about the usability of the different design proposals
because of lack of context. Obviously these types of trials have little reliability and validity, although there is evidence to suggest that small user groups can, if the trials are conducted in the right manner, reveal a high proportion of usability problems (Thomas, 1996).

General Observations
Many of the advantages and disadvantages of high and low level prototyping are well
reported, for example Rudd et al (1996) offer advice on the applicability and resource
implications for both approaches. Although our findings support these general advantages
and disadvantages, we found that neither approach addressed the following important methodological issues, which needed to be satisfied during the development process of these two studies.
Neither approach offered a satisfactory solution to enabling contextual usability to be
incrementally assessed, which is important in the evaluation of consumer products. Also, both
approaches were unsatisfactory in handling the effect of interrelated user interface elements
(for example the interaction between icon representation and icon behaviour) both in terms of
development and evaluation. In practice, high level prototypes do allow these inter-related
elements to be evaluated but at the cost of committing too many design decisions early in the
conceptual development process. Conversely, low level prototypes allow more ‘fluidity’ in
the design process but lack contextual relevance and the ability to resolve usability problems
at a high level of abstraction from the interaction process.
These methodological concerns may be symptomatic of developing novel interfaces
where less is generally known about specific interface design principles. This would suggest
that interface design principles for established types of user interface within the field of HCI, unlike those for novel interfaces, have matured to a degree where conventional dialogue style design principles are well understood, and it is therefore possible to determine more distinct roles for both approaches to prototyping.

Conclusions
Our findings suggest that supporting both the creative development of novel interaction styles and interface designs, while also explicitly addressing usability issues, requires a design tool or guidelines that offer support on how and when to use high and low level prototyping to achieve contextual validity. On balance, we found that pushing towards a high level prototype revealed far more important contextual usability problems. What was needed was ‘time-out’ from this process to conduct low level prototype studies that would yield valid results to support the development programme.
It is the integration of these two approaches that we hope to review as part of our design tools and guidelines development programme.

References
Carroll, J.M. and Rosson, M.B. (1992), Getting around the task-artefact cycle: how to make
claims and design by scenario. ACM Transactions on Information Systems, 10(2), pp
181–212.
Gould, J.D., Lewis, C.H., (1985) Designing for Usability—Key Principles and What
Designers Think, Communications of the ACM, 28, pp 300–311
Harmelen, M., van, (1989) Exploratory User Interfaces Design Using Scenarios and
Prototypes, In Sutcliffe A. and Macaulay L. (eds). People and Computers V:
Proceedings of the Fifth Conference of the British Computer Society (Cambridge),
Cambridge University Press pp. 191–202
Henninger, S., Haynes, K., Reith, M.W., (1995) A Framework for Developing Experience-Based Usability Guidelines. Proceedings of the Symposium on Designing Interactive Systems (DIS ‘95), Ann Arbor MI, pp 43–53
Hix, D., Hartson. H.R., (1993) Developing User Interfaces: Ensuring Usability Through
Product and Process, J Wiley and Sons, New York.
Rudd, J., Stern, K., Isensee, S., (1996) Low vs. High Fidelity Prototyping Debate, Interactions, 3(1), January, pp 76–85
Newman, W.M., Lamming, M.G., (1995) Interactive System Design, Addison-Wesley, Reading, UK.
Nielsen, J., (1993) Usability Engineering, AP Professional, Boston.
Thomas, B., (1996) ‘Quick and Dirty’ Usability Tests, In (eds) Jordan, P.W., Thomas, B.,
Weerdmeester, B.A., and McClelland, I.L., Usability evaluation in Industry, Taylor
and Francis Ltd, London.
Veer, G.C., van der (1990). Human-computer interaction: learning, individual differences,
and design recommendations. Alblasserdam: Haveka
Wickens, C D & Andre, A D (1988) Proximity compatibility and the object display,
Proceedings of the Human Factors Society 32nd Annual Meeting, Santa Monica, CA.
pp 1335–1339
CREATIVE COLLABORATION
IN ENGINEERING DESIGN TEAMS

Fraser Reid, Susan Reed, and Judy Edworthy

Department of Psychology
University of Plymouth
Plymouth, PL4 8AA
United Kingdom

We analysed video recordings of engineering design team meetings for the
occurrence of visual and non-visual design reasoning, drawing space activity,
and conversational grounding. Three interactional patterns were observed: (a)
designers evolved complementary specialisms, either generating visual ideas,
or focusing on customer requirements and design constraints; (b) non-visual
design reasoning was highly interactive, whilst design visualisation consisted
of uninterrupted bursts of design ideas from single individuals; (c) conversational
grounding was initiated by the speaker during visualisation sequences, but by
the listener in non-visual sequences. These interactional patterns have major
implications for the design of virtual workspaces, and we evaluate the concept
of “seamless collaboration media” in the light of these results.

Shared workspaces and interactive design


The use of shared workspaces has become a major focus of research interest as developers
strive to build real-time computer systems capable of supporting designers working at a
distance from each other. These virtual workspaces typically combine a variety of media, and
might include live video images, electronic whiteboards, structured drawing tools, hypertext
editors, and other tools. The aim is to allow people in different locations to see, point to,
sketch, or write down design ideas whilst simultaneously holding a conversation with their
colleagues. Support for interactive sketching is now widely regarded as an essential
component of virtual workspace systems, and this rests on three assumptions concerning the
role of sketching in collaborative design. The first is that freehand sketching stimulates visual
imagination, and allows the designer to capture and manipulate emergent visual ideas.
Secondly, sketches provide designers with a common task focus and an expressive medium
through which to externalise and communicate design ideas. Thirdly, collaborating through
sketches encourages designers to build on each others’ ideas, and to combine them to form
novel design solutions. In short, interactive sketching is assumed to play an indispensable
role in the design process by providing a creative forum in which designers can explore a
problem space and reason interactively about new design solutions. This reasoning lies
behind the development of “seamless collaboration media” that integrate interpersonal and
group processes within a shared workspace (eg. Ishii et al, 1995).
However, the collaborative process of building on and interactively developing creative
designs is as yet poorly understood. Tang’s (1991) milestone study of collaborative drawing
highlights the importance of simultaneous visual access to the drawing surface, but sheds
little light on how designers support and contribute to each others’ design reasoning. In this
paper, we describe an observational study which examines how co-located teams of design
engineers incorporate freehand sketches into the interactive process of design reasoning.

Design reasoning in engineering team meetings


Our data was gathered from six teams of engineering design students working on realistic
design briefs as part of their final year of training at the University of Plymouth. Each team
consisted of about six designers drawn from different specialisms (mechanical and materials
engineering, manufacturing, design technology, etc.). Half of the teams designed a
lightweight, low-cost portable river crossing system for use by aid workers, whilst the
remaining teams designed an electrically adjustable changing bed for use in schools and
clinics. Each team was presented with customer requirements, market analyses, supplier and
manufacturing information, and large-format paper pads and sketching materials. Using
unobtrusive wide-angle and overhead cameras, we collected time-stamped video recordings
from the first meeting of each team, and focused specifically on active design episodes:
extended interchanges in which potential designs were formulated, developed, and evaluated.
Each episode was divided into simple speech units corresponding to the design ideas they
conveyed, or to the conversational functions they carried out.
Speech units were then coded to produce three separate event sequences for each team.
Firstly, where speech units were accompanied by actions (sketching, writing, pointing, or
gesturing) performed on or over the shared drawing surface, these actions were coded and
recorded. Secondly, speech units associated with conversational grounding (Clark and
Brennan, 1991) were coded as grounding requests if the speaker sought verbal or nonverbal
evidence to confirm that a listener understands an utterance, or as grounding offers if the
listener provided unsolicited verbal or nonverbal evidence that they had (or had not)
understood a speaker’s utterance. Thirdly, speech units conveying design reasoning were
classified as visual arguments if visible workspace actions were necessary to their
communication, and as non-visual arguments if this was not the case. Visual arguments
typically conveyed potential design solutions or solution fragments, whilst non-visual
arguments mostly referred to customer requirements, materials specifications, or other
constraints on the design. To assess reliability of coding, a randomly selected team protocol
(constituting a 10% sample of the pool of speech units) was independently coded by two of
the authors. An overall intercoder agreement of 83% was obtained, with significant kappa
agreement coefficients for design reasoning, κ=.75, and conversational grounding, κ=.76.
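
For reference, the kappa coefficient can be computed from a two-coder confusion matrix as in the sketch below; the counts shown are invented for illustration and are not the study's data.

    # Minimal sketch of Cohen's kappa for two coders.  The confusion matrix is
    # invented for illustration; rows are coder A's codes (visual, non-visual,
    # other) and columns are coder B's codes for the same speech units.

    def cohens_kappa(confusion):
        total = sum(sum(row) for row in confusion)
        p_observed = sum(confusion[i][i] for i in range(len(confusion))) / total
        row_totals = [sum(row) for row in confusion]
        col_totals = [sum(col) for col in zip(*confusion)]
        p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
        return (p_observed - p_expected) / (1 - p_expected)

    matrix = [[70, 5, 5],
              [6, 120, 14],
              [4, 15, 83]]
    print(round(cohens_kappa(matrix), 2))   # about 0.77 for these invented counts
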
To our surprise, we found that visual reasoning necessitating the use of a shared
workspace was relatively infrequent in our design teams, even during the active design
episodes selected for detailed analysis. Of the 3225 speech units in the sample, only 829
(25.7%) were classified as visual arguments, whilst 1391 (43.1%) were classified as non-
visual arguments. This pattern was consistent over teams: non-visual arguments were
significantly more numerous than visual arguments in five of the six design teams (χ2 ranging
from 6.20 to 68.96, all with df=2), with one team showing a similar, but marginally non-
significant (χ2=5.06) pattern. Almost all (791) of the visual arguments in the sample were
accompanied by visible workspace activity, mainly sketching (383 units), and pointing to
sketches (280 units). Furthermore, use of the shared workspace was not confined to visual
design reasoning: over a quarter of the non-visual arguments (380 units) were accompanied
by workspace activity, in this case mainly pointing (183 units) and gesturing (167 units).
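
A per-team test of this form (three speech-unit categories, hence df=2) can be obtained as a chi-square goodness-of-fit test. The sketch below uses invented counts and, for simplicity, equal expected frequencies, which may not correspond to the authors' exact baseline.

    # Minimal sketch of a df=2 chi-square test on one team's speech-unit counts.
    # Counts are invented; equal expected frequencies are assumed for simplicity
    # and may differ from the baseline actually used in the study.

    from scipy.stats import chisquare

    counts = [130, 240, 180]        # visual, non-visual, other speech units
    stat, p = chisquare(counts)     # default expectation: equal frequencies
    print(f"chi-square = {stat:.2f} (df = {len(counts) - 1}), p = {p:.4f}")
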
We also found—as Goldschmidt (1995) would have predicted—that the designers in our
study adopted specialised and complementary roles within the design process, each
contributing certain kinds of design reasoning to the whole. Chi-square tests revealed that in
five of the six design teams, it was the most active member that specialised in producing
visual design ideas (χ2 ranging from 11.09 to 53.73, dfs between 4 and 6), using the shared
workspace to gesture, sketch, and point to sketches. Other team members supported the
visualisation specialist by focusing on customer requirements and design constraints. Just one
team deviated from this pattern, but even here we found evidence of role specialisation, with
the most active member emphasising non-visual arguments (χ2=50.41, df=6), and two other
designers sharing responsibility for producing visual design ideas.

Interactive design and conversational grounding


We then investigated how the engineers in our study coordinated their design ideas with those
of their colleagues. One possibility we explored was that teams would engage in interactive
design visualisation, working simultaneously to visualise potential solutions, combining these
to arrive at an acceptable design. Alternatively, our teams might prefer to exchange visual and
non-visual arguments on an alternating basis, with visualisation specialists generating new
ideas which others in the team promptly evaluate. To explore this, we analysed transitions
between the design arguments produced by the engineers, and compared these with a statistical
model computed from the baseline rates for arguments produced by each team. The results of
this analysis are shown in Figure 1(a). In this analysis, the identities of individual designers
are unimportant: instead, we focused on transitions between arguments produced by the same
speaker (S-S), and argument transitions between different speakers (S-O). Solid links indicate
transitions averaged over the six teams that significantly exceed those predicted by the model,
whilst the lighter links indicate chance level transitional probabilities. Links that are significantly
weaker than those predicted by the model have been omitted from the figure.

Figure 1. (a) Transitions between visual arguments (Vs, Vo), non-visual arguments (Ns, No), and other speech units (Os, Oo); (b) grounding requests (Res, Reo) and offers (Ofs, Ofo) associated with visual arguments; (c) grounding requests and offers associated with non-visual arguments
Two quite distinct interactional patterns emerged from this analysis. Firstly, argument transitions
between designers were more likely to occur during non-visual reasoning sequences (Ns-No;
φ=.14, p<.01). These sequences consisted of brisk, highly interactive exchanges of single-
argument speaking turns. In contrast, visualisation sequences involved extended speaking turns
by individual designers, consisting of chains of visual arguments and other utterances (Vs-Vs; φ=.23, p<.01; Vs-Os; φ=.08, p<.05), accompanied by sketching, pointing, and figural gesturing.
Inspection of these chains revealed uninterrupted turns of up to six arguments to be common, with the longest chain consisting of eleven consecutive visual arguments. Surprisingly, we could
find little evidence of interactive visualisation, or of alternation between the visual and non-
visual arguments of visualisation specialists and other team members.
Visualising a design solution involves developing novel—or at least initially unshared—
ideas, and communicating these to colleagues requires effort and persistence. We therefore
expected our designers to request and offer positive evidence of conversational grounding
more frequently during visual than non-visual argument sequences. Lag sequential analysis
was used to test this idea. Because grounding information can also be conveyed by a design
idea (eg. by one designer completing another person’s utterance, or expanding on their ideas),
we tested whether grounding occurred more or less frequently at the same time as visual and
non-visual design arguments (lag 0), as well as immediately following them (lag 1). The
results of these analyses are shown in Figures 1(b) and 1(c). Again, solid links are those that
occur significantly more frequently than expected by chance. In line with our expectations,
designers requested evidence of grounding significantly more frequently than chance after
presenting a visual argument (Vs-Rs, lag 1; φ=.10, p<.01). However, they also offered
evidence of grounding more readily on hearing a colleague present a non-visual argument
(Ns-Ofo, lag 1; φ=.14, p<.01). Evidently, the initiative for establishing grounding resides with
the speaker in visual sequences, but with the listener in non-visual sequences.
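
For a single pairing of codes, a lag-1 test of this kind reduces to a 2 by 2 table crossing 'current unit is (or is not) a visual argument' with 'next unit is (or is not) a grounding request by the same speaker', from which phi can be computed. The sketch below uses an invented code sequence and is not the authors' analysis software; a full analysis would also test significance.

    # Minimal sketch of a lag-1 sequential measure for one pair of codes:
    # how strongly does a grounding request (R) follow a visual argument (V)?
    # The code sequence is invented; a real analysis would also test significance.

    from math import sqrt

    def phi_for_lag1(sequence, antecedent, consequent):
        a = b = c = d = 0                       # cells of the 2 x 2 table
        for curr, nxt in zip(sequence, sequence[1:]):
            if curr == antecedent and nxt == consequent:
                a += 1                          # V followed by R
            elif curr == antecedent:
                b += 1                          # V followed by something else
            elif nxt == consequent:
                c += 1                          # non-V followed by R
            else:
                d += 1                          # non-V followed by non-R
        return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

    codes = ["V", "V", "R", "N", "O", "V", "R", "N",
             "N", "O", "V", "V", "O", "N", "R"]
    print(round(phi_for_lag1(codes, "V", "R"), 2))   # about 0.34 for this sequence
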
Our results suggest that visual argumentation, and the workspace activity that typically
accompanies it, discourages turn transitions, leaving the speaker with the initiative either to
continue their speaking turn, or to invite a colleague to speak. In contrast, non-visual
argumentation is highly interactive. One explanation for this is that the management of
conversational turn-taking differs in the two modes of design reasoning. The introspective
character of visualisation sequences, together with their external focus on sketches or other
workspace materials, may signal a special state of conversational disengagement that
suppresses the transition relevance of the clausal, phrasal, or lexical boundaries that
ordinarily signal turn completion (Goodwin, 1981). Furthermore, the direction of the
speaker’s gaze is an important turn-taking cue, but only when speech is disfluent and hesitant,
or when the baseline level of speaker gaze is low (Beattie, 1979). These are exactly the
conditions we observed in the visualisation sequences: designers struggled to verbalise their
ideas, occasionally ceasing vocalisation altogether, and at the same time directed their gaze,
not towards the listener, but to the work surface and the sketches they were creating. From
their perspective, gaze avoidance helps them cope with the cognitive demands imposed by
visual problem solving. From the listener’s perspective, the visualiser’s averted gaze signals
their temporary non-availability for interaction. Under these conditions, turn transitions are
likely to be solicited by explicit displays of recipiency by the visualiser.

Implications for the design of virtual workspaces


These observations have several implications for the design of virtual workspace
technologies. Firstly, the finding that only one quarter of all speech units observed here
required access to a shared workspace surely suggests that only a very small proportion of the
activities of design teams will require the processing power and advanced display
technologies currently under consideration for shared workspace systems. This does not, of
course, mean that such systems can dispense with support for real-time communication. The
highly interactive non-visual interchanges that we observed clearly need sufficient bandwidth
to carry synchronous group-wide communication, though the evidence accumulated over the
last twenty years points overwhelmingly to the adequacy of high-fidelity audio connectivity,
rather than more costly video connectivity, for this purpose.

Other categories of interaction in the present study, however, are likely to depend on high
levels of video connectivity. Just how much connectivity is desirable depends on the type of
design reasoning in which the team is engaged. The majority of the workspace actions
accompanying speech in the present study involved producing and pointing to sketches, and this
can readily be supported by an audio link coupled to a “virtual sketchbook”, such as VideoDraw
(Tang and Minneman, 1990)—a system that allows concurrent access to a drawing surface over
which hand gestures and sketches by different designers can be superimposed, and high levels
of interaction supported. It is important to note that over one quarter of the non-visual arguments
observed in the present study involved some form of visible workspace activity—mainly pointing
and gesturing—for which this level of video connectivity is likely to be sufficient.
However, the visualisation sequences we observed clearly require the additional capability
to monitor the direction of a colleague’s gaze and therefore his or her focus of attention. In
order to establish the recipiency of the visualiser and avoid unwarranted interruptions, listeners
need visual access to the shared drawing surface and to a view of the visualiser that provides
information on their attentional focus. This could, of course, be achieved using a three-quarters
video viewpoint that encompasses both the person and the workspace. But the visualiser needs
at the same time to signal turn completion using familiar cues, including gazing directly at the
listener. Here it might reasonably be thought that the “seamless” ClearBoard workspace developed
by Ishii (eg. Ishii et al, 1995) would provide the unimpeded access to work and interpersonal
spaces necessary for effective collaboration. This system allows designers working from remote
locations simultaneously to draw on, and talk through, a virtual “glass window” onto which a
head-and-shoulders video image of the partner is superimposed. Designers can not only see and
manipulate the same images on the drawing surface, but can also see each other and what they
are doing. One of the advantages claimed for this system is that less eye and head movement is
needed to switch focus between the drawing surface and the partner’s face, making eye-contact
easier to establish. However, it also makes it more difficult to avoid. Our present findings imply
that any enhancement of the shared workspace which prevents designers from visibly disengaging
from their colleagues is likely to perturb not only the process of visualising design ideas, but
also the subtle and precise coordination between speech and eye gaze associated with the
production and exchange of speaking turns. Careful empirical investigation of design interactions
in these environments is now needed to evaluate these conclusions.

References
Beattie, G.W. 1979, Contextual constraints on the floor-apportionment function of gaze in
dyadic conversation, British Journal of Social and Clinical Psychology, 18, 391–392
Clark, H.H. and Brennan, S.E. 1991, Grounding in communication. In L.B.Resnick,
J.M.Levine and S.D.Teasley (eds.) Perspectives on Socially Shared Cognition,
(American Psychological Association, Washington), 127–149
Goldschmidt, G. 1995, The designer as a team of one, Design Studies, 16, 189–209
Goodwin, C. 1981, Conversational Organization: Interaction Between a Speaker and a
Hearer, (Academic Press, London)
Ishii, H., Kobayashi, M. and Grudin, J. 1995, Integration of interpersonal space and shared
workspace: ClearBoard design and experiments. In S.Greenberg, S. Hayne and R.Rada
(eds.) Groupware for Real-Time Drawing: A Designer’s Guide, (McGraw-Hill,
London), 96–125
Tang, J. and Minneman, S. 1990, VideoDraw: A video interface for collaborative drawing. In
Proceedings of the ACM/SIGCHI Conference on Human Factors in Computing, (ACM
Press, New York), 313–320
Tang, J.C. 1991, Findings from observational studies of collaborative work, International
Journal of Man-Machine Studies, 34, 143–160
DESIGN AND
USABILITY
PLEASURE AND PRODUCT SEMANTICS

Patrick W.Jordan

Senior Human Factors Specialist, Philips Design, Building W, Damsterdiep


267, P.O. Box 225, 9700 AE Groningen, The Netherlands

Alastair S.Macdonald

Course Leader, Product Design Engineering, Glasgow School of Art, 167


Renfrew Street, Glasgow G3 6RQ, Scotland

Human factors has tended to focus on pain. As a profession, it has been very
successful in contributing to the creation of products that are safe and usable
and which, thus, spare the user physical, cognitive and emotional
discomfort. However, little attention seems to have been paid to the positive
emotional and hedonic benefits—pleasures—that products can bring to their
users. This paper examines the relationship between product semantics and
pleasure in use, within the structure of the ‘Four Pleasure Framework’.
Studies such as this represent human factors’ first steps towards establishing
links between product properties and the types of emotional, hedonic and
practical benefits that products can bring to their users.

Introduction
“I can sympathise with other people’s pains, but not with their pleasures. There is something
curiously boring about someone else’s happiness.” This quote comes from Aldous Huxley’s
1920 book, ‘Limbo’; however, it could almost be a motto for ergonomics. Ergonomics
journals, conference proceedings, and textbooks seethe with studies of pain: back pain, upper
limb pain, neck pain, pain from using keyboards, pain from using industrial machinery, pain
from hot surfaces—these are just a few examples from studies that have been presented at
recent Ergonomics Society Conferences.
As a discipline, ergonomics has been focused on eliminating pain, whether it be in the
form of physical pain, as in the examples above, or the cognitive/emotional discomfort that
can come from interacting with products that are difficult to use. Meanwhile, the idea that
products could actually bring positive benefits—pleasures—to users, seems to have been
largely ignored. So, whilst ergonomics has had a great deal to offer in terms of assuring
product usability and safety, it seems to have had very little to contribute in terms of creating
products that are positively pleasurable.
The case for ergonomists to take the lead in addressing the issue of pleasure with products
has been made elsewhere (Jordan 1997a) and a framework for approaching the issue—the four
pleasures—has been proposed (Jordan 1997b). In this paper, this framework will be summarised
and illustrated with examples that show the relationship between pleasure and product semantics.

Pleasure with Products


Pleasure with products is defined as: “…the emotional, hedonic and practical benefits
associated with products.” (Jordan 1997a)

The Four Pleasures


The four pleasure framework was originally espoused by Canadian anthropologist Lionel
Tiger (Tiger 1992) and subsequently adapted for use in design (Jordan 1997b). The
framework models four conceptually distinct types of pleasure—physio, socio, psycho and
ideo. Summary descriptions of each are given below with examples to demonstrate how each
of these components might be relevant in the context of products.

Physio-Pleasure
This is to do with the body—pleasures derived from the sensory organs. They include pleasures
connected with touch, taste and smell as well as feelings of sexual and sensual pleasure. In the
context of products physio-p would cover, for example, tactile and olfactory properties. Tactile
pleasures concern holding and touching a product during interaction. This might be relevant to,
for example, the feel of a TV remote control in the hand, or the feel of an electric shaver against
the skin. Olfactory pleasures concern the smell of the new product. For example, the smell
inside a new car may be a factor that affects how pleasurable it is for the owner.

Socio-Pleasure
This is the enjoyment derived from the company of others. For example, having a conversation
or being part of a crowd at a public event. Products can facilitate social interaction in a number
of ways. For example, a coffee maker provides a service which can act as a focal point for a
little social gathering—a ‘coffee morning’. Part of the pleasure of hosting a coffee morning
may come from the efficient provision of well brewed coffee to the guests. Other products may
facilitate social interaction by being talking points in themselves. For example a special piece of
jewellery may attract comment, as may an interesting household product, such as an unusually
styled TV set. Association with other types of products may indicate belonging to a social
group—Porsches for ‘Yuppies’, Dr. Marten’s boots for skinheads. Here, the person’s relationship
with the product forms part of their social identity.

Psycho-Pleasure
Tiger defines this type of pleasure as that which is gained from accomplishing a task. It is the
type of pleasure that traditional usability approaches are perhaps best suited to addressing. In
the context of products, psycho-p relates to the extent to which a product can help in
accomplishing a task and make the accomplishment of that task a satisfying and pleasurable
experience. For example, it might be expected that a word processor which facilitated quick
and easy accomplishment of, say, formatting tasks would provide a higher level of psycho-
pleasure than one with which the user was likely to make many errors.

Ideo-Pleasure
Ideo-pleasure refers to the pleasures derived from ‘theoretical’ entities such as books, music
and art. In the context of products it would relate to, for example, the aesthetics of a product and
the values that a product embodies. For example, a product made from bio-degradable materials
might be seen as embodying the value of environmental responsibility. This, then, would be a
potential source of ideo-pleasure to those who are particularly concerned about environmental
issues. Ideo-pleasure would also cover the idea of products as art forms. For example, the video
cassette player that someone has in the home is not only a functional item, but something that
the owner and others will see every time that they enter the room. The level of pleasure given by
the VCR may, then, be highly dependent on how it affects its environment aesthetically.

Product Semantics
Product semantics refers to the ‘language’ of products and the messages that they
communicate (Macdonald 1997). Product language can employ metaphor, allusion, and
historical and cultural references, whilst visual cues can help to explain the proper use or
function of a product. What follows are a number of examples, demonstrating the link
between product semantics and pleasure with products.

Karrimor’s Condor Rucsac Buckle


This side release buckle closes with a very positive ‘click’. Visual, audio, and tactile feedback
combine to ensure that the rucsac looks, feels and sounds good. These physio-pleasures are
part of projecting the benefit of a reliable and reassuring fastening.

Global Knives
Global knives are a new concept in knives, designed and made in Japan. The blades are made
from a molybdenum/vanadium stainless steel and are ice tempered to give a razor sharp edge.
The integral, hollow handles are weighted to give perfect cutting balance with minimum pressure
required. The comfort of the knife in the hand and the aesthetic sensation of the finely balanced
weight are both physio-pleasures. Because of their smooth contours and seamless construction,
the knives leave no crevices in which food and germs can collect and thus are exceptionally hygienic.
This provides the user with a feeling of reassurance—a psycho-pleasure.

NovoPen™
Traditionally, those suffering from diabetes had to use clinical looking syringes and needles.
The NovoPen™ is a device for the self-administration of precise amounts of insulin. Its
appearance is rather like that of a pen—this provides a more positive signal than that of the
hypodermic syringe, which is tainted by medical and drug abuse associations. This
offers the user both ideo- and socio-pleasure, by playing down any stigma that the user and
others may associate with syringes and/or the medical condition. The NovoPen™ also
incorporates tactile and colour codes which refer to the different types of insulin dosage that
may be required. These provide sensory back up and contribute to the product’s aesthetic
profile. The technicalities of administering precise dosages have been translated into easy
human steps, and a discreet but positive click occurs when the dose is prepared for delivery.
This provides the psycho-pleasure of reassurance to the user in what might otherwise be a
rather daunting task. Finally, the NovoPen™ also provides physio-pleasure through its tactile
properties—the pen is shaped to fit the hand comfortably and the surface texture, achieved
through spark erosion, is pleasant to the touch.

Samsonite Epsilon Suitcase


Journeys through air terminals can be fraught with stress—both physical and psychological.
The three handle options on the Samsonite Epsilon Suitcase provide a number of comfortable
options for lifting, tilting or trailing. The handle material is a non-slip rubberised coating which
does not become sweaty or slippery in use. These features, providing physio-p, reduce the stress
associated with the situation. The design of the suitcase’s castors allows controllable and
responsive movement in the ‘trailing’ mode, the suitcase ‘obeying’ the needs of the user, providing
a degree of psycho-p over other suitcases that do not obey their owners’ will.

Mazda Car Exhaust


‘Kansei Engineering’ is a term coined by Nagamachi (1995) for turning emotions into
product design, and has been extensively employed in automotive design. The Mazda team
has engineered the sound emitted by the MX5 Miata to evoke association with classic
(British?) sports cars, satisfying ideo-p (macho, youthful associations), and socio-p (I have
arrived, have I not?!).

Table 1 gives a summary of the benefits associated with the products within the context of the
four pleasure framework.

Table 1. Four pleasure analysis of the benefits associated with the example products.

Designing Pleasurable Products


The examples given have demonstrated that, through their semantics, products can provide
different types of pleasure to their users. Even from this little selection, it is clear that
products can bring practical, emotional and hedonic benefits to users which go beyond those
associated with concepts such as ergonomic design and usability.
The four pleasure framework gives a useful structure within which to approach the issue
of pleasure with products. In particular, it has proved useful at the beginning of the product
creation process, as a vehicle for discussion and agreement between human factors, design,
product management, marketing, engineering and market research, as to what the main
benefits delivered by a product should be. These agreements lead to a unity of purpose that
is so important in creating products that deliver clear benefits and tell a clear ‘story’. For
example, Jordan (in preparation) describes a set of benefits that might be agreed for a new
photo camera, with young professional women as the target group. They are summarised in
table 2.
Having agreed that these are the benefits to be delivered, the entire product development
team can then concentrate on these. Having these common aims in mind when developing the
technology, design and marketing material for a product ensures that all disciplines are
working to a common goal.

Conclusions
A number of examples have been given, illustrating links between product semantics and
pleasure in use. This was based on a qualitative analysis within the context of the Four
Pleasure Framework. Because of its simplicity and accessibility, this framework is a useful
tool, suited to the multi-disciplinary nature of product development. It supports constructive,
focused and progressive co-operation to move the design along. Such approaches enable
human factors to move beyond usability to support the creation of products that are a positive
pleasure to use—products that will delight the customer.

Table 2. Four Pleasure Analysis of product requirements for a camera aimed at young
women of high socio-economic status (from Jordan, in preparation).

References
Jordan, P.W., 1997a, Putting the pleasure into products, IEE Review, November 1997,
249–252
Jordan, P.W., 1997b, A Vision for the future of human factors. In K.Brookhuis et al. (eds.)
Proceedings of the HFES Europe Chapter Annual Meeting 1996, (University of
Groningen Centre for Environmental and Traffic Psychology), 179–194
Jordan, P.W., in preparation, The four pleasures—human factors for body, mind and soul,
submitted to Behaviour and Information Technology
Macdonald, A.S., 1997, Developing a qualitative sense. In N.Stanton (ed) Human Factors in
Consumer Products, (Taylor and Francis, London), 175–191
Nagamachi, M., 1995. The Story of Kansei Engineering, (Kaibundo Publishing, Tokyo)
Tiger, L., 1992, The Pursuit of Pleasure, (Little, Brown and Company, Boston)
A SURVEY OF USABILITY PRACTICE AND NEEDS
IN EUROPE

Martin Maguire and Robert Graham

HUSAT Research Institute


Loughborough University
The Elms, Elms Grove
Loughborough, Leics. LE11 1RG
Tel: +44 1509 611088, Fax: +44 1509 234651
m.c.maguire@Lboro.ac.uk, r.graham@Lboro.ac.uk

A survey was carried out to determine the state of current usability practice
in Europe to assist with the dissemination of usability information and
services to industry and EC projects. The survey shows that organisations
are aware of the need for usability in the design process and carry out
relevant usability activities, although the extent to which they are performed
may vary e.g. for bespoke systems versus ‘off-the-shelf’ products. The paper
also reports on current usability activities at different stages in the design
process, possible methods for enhancing usability activities, and
requirements for usability information.

Introduction and Aims


The human factors community has produced a wide range of methods and guidelines that can
assist the system development process for the benefit of the end users. To ensure that these
potential human factors inputs are used effectively, it is important to consider the views of
companies and organisations that might receive them. An earlier survey carried out by Dillon
et al (1993) found that the greatest need is for usability methods which fit in with the
development lifecycle. This paper looks at the specific needs for usability support within the
lifecycle in more detail.
HUSAT is part of a project called INUSE (Information Engineering Support Centres),
funded by the European Commission (EC), which has set up a network of usability support
centres to provide usability advice to industry and to the Telematics community (INUSE,
1997). As part of the project, it was planned to survey current usability practices within
European organisations, the needs for usability support and the form that such support should
take. The results of the survey are presented under the following headings:

• Scale of usability activities performed


• Usability information and support currently employed
• Comparison of bespoke systems with ‘off-the-shelf’ packages
• Usability activities in the design process
• Methods of incorporating usability into system design
• Types of information to assist with usability activities
• Methods of disseminating information about usability

Survey Sample
The survey data was collected via a self completion questionnaire. The population sample
was drawn from existing company contacts, delegates at EC Telematics project concertation
meetings, the UK HCI Conference, and a Ministry of Defence suppliers exhibition in the UK.
Based on a target number of 140 individuals, 65 responses (a rate of 46%) were received. As
the responses are drawn from those people sufficiently interested to complete the survey, the
results may tend to show a generally positive attitude towards usability activities. Arguably
however, such respondents are in a good position to comment on their organisations’ own
usability practices.
Of the 65 respondents, most came from industrial companies (40), with others
representing government departments (14), academic and industrial research centres (11).
The majority of the respondents were from the UK (37), but nine other European countries
were represented, including Greece, Belgium, Germany, the Netherlands, Denmark, Italy,
Sweden, Spain and Norway. Respondents came from a range of occupations. Many stated
that their principal role within their organisation was project manager (25). Other common
roles were human factors specialists (13) and designers or software engineers (11). There
were also representatives from marketing, system procurement, R&D, user representatives in
a design team and quality assurance. The organisations produced a range of systems e.g.
communications, management, financial and public information systems. They varied in size
from small groups of less than 10 employees to large multi-national companies.

Results

Scale of usability activities performed


A question was asked about the scale of usability activities carried out within the
organisation. In response, 18% stated that they carried out little or no such activity, 60% said
that they carried out usability work on a small scale, while 22% stated that they carried it out
on a large scale.

Usability information and support currently employed


The different types of usability advice incorporated are shown in Table 1 below, in order of
frequency. It is interesting to note that general literature sources are rated more highly than
standards. Other sources of information are only used by about half of the organisations or less.

Table 1. Current use of usability information and support

Comparison of bespoke systems with ‘off-the-shelf’ packages


Over half of the organisations who responded to the questionnaire (55%) designed custom or
bespoke products for a single client. About a third of the organisations (34%) designed mass
market or ‘off-the-shelf’ packages. The differences between the two classes of product are
shown in Figure 1. Around 90% of developers of bespoke products discussed requirements
with the customer or purchaser, while only 50% of the developers of off-the-shelf systems did
the same. Few ‘off-the-shelf’ system developers (18%) involve end-users in the design team
while considerably more (42%) bespoke system developers do so. These differences are
perhaps less surprising bearing in mind that for mass market products, there is typically no
specific customer or set of users to focus on. It would therefore be desirable to offer such
companies clearer advice about how to set up representative customer panels and to recruit
users from the outside world with relevant characteristics.

Figure 1. Comparison of usability activities for bespoke and ‘off-the-shelf’ products

Usability activities in the design process


One section of the survey considered different usability activities within the design process.
For each activity, respondents could select one from a given set of responses to indicate how
fully each activity was considered and carried out. The activities included:

1) Getting consideration of usability into the initial contract for the system.
2) Management of usability activities during the project.
3) Gaining access to the right users at the right time.
4) Analysing the user and task characteristics and organisational setting.
5) Collection/specification of the user and organisational requirements.
6) Iterative development and evaluation of prototype solutions.
7) Evaluation of the final product with users.
8) Field trials of the product in use.
9) User support of the product after purchase.

For most of the above activities, between 44% and 51% of respondents stated that they felt
they were carried out fully whether part of a quality plan or not. However for activities 3,
‘Gaining access to the right users at the right time’ and 4, ‘Analysing the user and task
characteristics and organisational setting’ only 37% of respondents felt they were considered
properly. Others stated that the task had not been carried out fully, that they had too little
knowledge to consider it properly, or even that it was not seen as important by management or
staff. These results indicate the need for greater dissemination of usability context analysis
techniques (Bevan and Macleod, 1994). Also the problem of gaining access to users to make
inputs into the design process or to act as subjects within usability evaluations should be
considered and planned early on in the lifecycle, if the project is to have representative user
input.
Perhaps the most fundamental activity listed above is 1, ‘Getting consideration of
usability into the initial contract for the system’. While 51% of respondents felt that this was
carried out fully, only 12% incorporated it into a quality plan. This indicates the need to
consider linking usability and quality planning more closely. It is also of interest to note that
for activity 9, ‘User support after purchase’, although 46% stated that they carry it out fully, a
high proportion (24%) felt that they had too little knowledge to consider it properly or that it
was not seen as of major importance.
Taking the results as a whole, it seems that the majority of organisations do see the need
for, and do perform, relevant usability processes during the design lifecycle. However for
most of the activities less than half do so fully, indicating the need for more support and
information.

Methods of incorporating usability into system design


Subjects rated various ways in which usability could be incorporated or improved in their
company, on a scale from 1 (not useful) to 5 (essential). Table 2 below shows the mean
ratings given to each method.

Table 2. Methods of incorporating usability

Respondents rated highly the idea of training designers to help themselves, and providing
access to users and tools to help them carry out usability processes. There appears less interest
in employing external usability specialists although this is partly due to respondents having
usability expertise in-house. Analysing these results further, it was found that companies
developing ‘off-the-shelf’ systems feel more strongly about the need for external expertise (mean
rating 3.6 versus 2.5 for bespoke developers). This may be because the user population being
designed for, and the tasks that users carry out, are less well defined. It was also found that
companies who carry out usability activities on a large scale feel a greater need for improved
prototyping tools than those where it is on a smaller scale. They requested both (i) better
prototyping tools, and (ii) usability knowledge to be built into the tools. This may be seen by
developers as a cost effective way of inputting usability into the design lifecycle.

Types of information to assist with usability activities


Respondents were asked to rate the types of information that might assist with usability
within the system development process. The mean ratings given in Table 3 below (on the
scale: 1=not useful to 5=essential) show that respondents are most interested in information
to supplement processes e.g. evaluation, prototyping and user requirements specification.

Table 3. Types of information to assist usability



Topics for guidelines on interface design were also investigated. It was found that respondents
seemed to find guidelines on the broader issues such as navigation and screen layout more
useful than interface specifics such as using colour or icons. Again, this may reflect the fact that
designers may be tied to a particular style, limiting their choice of available colour or icons.
Alternatively, it may be that issues such as user navigation are seen as more complex, and
designers need more guidance or knowledge than they already possess.

Methods of disseminating information about usability


Finally, respondents rated the various methods by which usability information could be
received. These are shown below (Table 4) in order of preference, the scores based on a scale
of 1=very poor to 5=excellent.

Table 4. Methods of disseminating usability information

Respondents felt that training is a good method of receiving usability information. Another
highly rated approach was to build usability principles into the software design tools
themselves. However, general usability guidelines, whether in the form of handbooks, video
or CD-ROM were seen as slightly less useful, perhaps as they require more effort to apply.

Conclusions
The findings seem to show that many organisations do carry out usability activities although
they may not be on a large scale or fully across the design lifecycle. It was found that ‘off-the-
shelf’ developers do not involve potential customers and end users as much as bespoke
developers, and so more advice to support this activity could be offered. Similarly a greater
awareness of usability standards could increase their usage, and more help on analysing the
user context is also needed. It is recommended that agreement to gain access to potential
users is made at an early stage to ensure good user representation. Finally, the approach of
training companies seems to be one of the most attractive options for integrating human
factors knowledge into the lifecycle, while the idea of software design tools incorporating
usability principles would also be of interest to organisations in the future.

Acknowledgement: Project IE 2016 INUSE is funded by the European Commission’s


Telematics Applications Programme.

References
Bevan, N. and Macleod, M. 1994, Usability measurement in context, Behaviour and
Information Technology 13(1–2), 132–145, Jan–Apr 1994, London: Taylor & Francis.
Dillon, A., Sweeney, M. and Maguire, M. 1993, A survey of usability engineering within the
European IT industry—Current needs and practices, Proceedings of the HCI ‘93
Conference, J.Alty, D.Diaper and S.Guest, (Eds), Cambridge Univ. Press, People and
Computers VIII, Loughborough, pp. 81–94., Sept 1993.
INUSE 1997, see http://www.npl.co.uk/inuse
CULTURAL INFLUENCE IN USABILITY ASSESSMENT

Alvin Yeo*, Robert Barbour† and Mark Apperley*

*Computer Science Dept., University of Waikato,
Private Bag 3105, Hamilton, New Zealand
†Science, Mathematics & Technology Education Research Centre,
University of Waikato, Hamilton, New Zealand

A study was conducted in Malaysia to identify cultural factors that may


affect results of usability assessment techniques. The usability evaluation
techniques used in this study were “think aloud”, System Usability Scale and
interviews. The results indicate that cultural factors such as power distance
and language might influence responses in the different usability
assessments. Recommendations on how to reduce the cultural effects are
also reported.

Introduction
Results of usability tests conducted in the domestic market may not be valid internationally
(Nielsen, 1990; Fernandes, 1995). The software must be tested in the target market. Testing is
necessary to ensure that the software is acceptable and does not cause offence to the target
community. As more software is marketed globally, software developers have to take into
account cultural issues that may impact usability testing. Methods that work in Western
cultures, e.g. the United States (US), may not work in other cultures (Herman, 1996).
Fernandes (1995) describes lessons learnt by Claris Corporation in Japan:

• “Questions regarding how comfortable or how much they “like” the product were
removed because they involved feeling and emotion which are issues that the
Japanese are not accustomed to responding to.
• Japanese women spoke very softly. This puts a huge premium on the quality of
the microphone and where it was placed.
• Co-discovery techniques were used but they became problematic when people of
differing status were put in the room together. In particular, women were found to
talk very little when they were paired with a man.” (Fernandes, 1995).

In Singapore, a subject actually broke down and cried during a software evaluation session
(Herman, 1996). However, during the post-test interview, the subject was very positive about
the software. This behaviour is believed to be attributable to Eastern culture, whereby it is
“considered culturally unacceptable to criticise the designer directly or openly, as this may
cause the designers to lose face” (Herman, 1996).
Usability assessment is conducted to improve the usability of the software. Thus, if results
of usability assessments are misconstrued, the software’s success might be compromised. It
is crucial that lessons from the above experiences be taken into account to ensure that
accurate, reliable and valid results can be drawn from the usability assessments. These issues
are especially important given that most, if not all, usability assessment techniques originate
from the West and are used in the East—one of the fastest growing and potentially largest
software markets (Software Publishers Association, 1996). There exist few studies of how
these usability evaluation techniques fare in Eastern countries. This lack of literature is
surprising as the US software industry (the biggest exporter of software in the world) earns more
than half of its revenue from outside the US. The more we know about the target users, the
greater the likelihood of success of the software. As highlighted in the examples, studies of
potential problems of the use of these Western techniques in the East are needed to ensure
more accurate results from usability evaluation.

Research Aims
The aims of this research were to: identify the cultural factors that affect usability testing,
examine how these cultural factors affect usability testing, and identify ways to improve usability
testing by reducing the cultural effects. The study was conducted in Malaysia, one of the fastest
growing markets in Asia. Results from this Malaysian study may be indicative of results in
other Asian countries as Malaysia shares similar cultural attributes with its neighbours.

Method
To identify the cultural factors that may affect usability testing, data was collected from an
experiment using three methods of usability assessment. The three usability assessment
methods were “think aloud” (described as probably the most valuable usability engineering
method (Nielsen, 1993)), the System Usability Scale (SUS) and the interview method. The
SUS is identified as a “quick and dirty” method to gauge the users’ response to the interface’s
usability (Brooke, 1995). Although the SUS is only ten questions long, it correlates well with
SUMI (Holyer, 1994). A University of Cork study reported a correlation of
0.8588 with SUMI (Holyer, 1994).
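For readers unfamiliar with how a set of SUS responses becomes a single score, the short Python sketch below applies the standard scoring rule published by Brooke (1995): odd-numbered (positively worded) items contribute their rating minus 1, even-numbered (negatively worded) items contribute 5 minus their rating, and the sum is multiplied by 2.5 to give a score out of 100. The example responses are hypothetical and are not data from this study.

    def sus_score(responses):
        """Convert ten SUS item ratings (each 1-5) into a 0-100 score.

        Standard scoring (Brooke, 1995): odd-numbered items score (rating - 1),
        even-numbered items score (5 - rating); the sum is scaled by 2.5.
        """
        if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
            raise ValueError("SUS needs ten ratings on a 1-5 scale")
        contributions = [
            (r - 1) if i % 2 == 0 else (5 - r)  # i = 0 is item 1 (odd-numbered)
            for i, r in enumerate(responses)
        ]
        return 2.5 * sum(contributions)

    # Hypothetical set of responses from one subject (not data from this study)
    print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))  # prints 85.0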
In the experiment, seventeen experienced spreadsheet users who were staff members of a
Malaysian university completed a set of tasks in a spreadsheet with a Bahasa Melayu
(Malaysia’s national language) interface. All the subjects had used Excel 5.0 in their work.
Their reported spreadsheet-use ranged from at least one hour a week to four hours a day.
Experienced spreadsheet users were recruited, as we believed that they would be able to
transfer their knowledge of one spreadsheet to another. An effort was made to select subjects
from the different levels of the organisation—the subjects’ occupations ranged from clerks to
managers. This diversity was to allow for a representative perspective from the different
levels of the organisation and also to take into account the status difference findings of
Fernandes (1995) above. The higher status subjects included managers, lecturers and tutors
whereas lower status subjects included clerks and administrative assistants.
The experiment was conducted by one of the authors who is a Malaysian and a tutor.
During the experiment, the users were required to “think aloud” and the session was tape-
recorded. The subjects’ interactions (keystrokes) were also recorded by a logging routine
in the spreadsheet. Once the users had completed the tasks, they were asked to evaluate the
spreadsheet’s usability by filling in a SUS survey form. After completing the SUS, the
subjects were also interviewed to obtain their opinions relating to the spreadsheet they had
just used. The log files were used to assist the transcription of the “think aloud” sessions. The
interviews were also transcribed. Data from all three sources were aggregated. A general
framework of grounded theory was used in the data analysis to identify cultural factors that
may exist in the experiment.

Results and Analysis


The transcriptions of the “think aloud” and interview sessions of all seventeen subjects were
examined. The subjects’ responses in the System Usability Scale were also scored. Overall,
all the subjects had problems completing the tasks in the spreadsheet as they had to contend
with the Bahasa Melayu interface as well as the fact that the spreadsheet was a DOS
spreadsheet which did not support mouse-use.
As the spreadsheet used was a DOS spreadsheet, a lot of negative comments about it were
expected, especially as all the users had been using Excel 5.0. It was observed that subjects of
higher status than the experimenter were “harsher” or more frank in their comments (note that
italicised English comments in this paragraph were translated from Bahasa Melayu). The
negative comments made by the lower status subjects (compared with the experimenter) were
more subtle. Some of the negative
comments made by nine of the ten higher-ranked subjects about the spreadsheet include
“…old fashioned” [S4—subject identified as S4 in the study], “Excel is easier” [S8], “…very
outdated” [S10], “I was taken aback it wasn’t that friendly I have to be frank with you” [S14],
“difficult/complicated” [S15], “I don’t like it…it’s difficult” [S18], “…system still at a very
primitive level” [S20], “For a beginner, it’s quite difficult for them to learn this…” [S21], “I
think it’s really difficult to learn” [S22]. However, only two of the seven lower-ranked
subjects made negative comments: “not user friendly” [S2] and “I think Excel is easier”
[S11]. The comments made by the remaining five lower-ranked participants were mainly
positive: “…not that bad a utility” [S3], “I think it’s okay… I think we better use this in our
[office]” [S7], “I feel it’s more effective if we use this.” [S12], “The spreadsheet is good…to
umm…replace Excel…” [S13], “I believe it…if we learn…it’s easier to use” [S19].
From the above comments, it would appear that the lower status subjects were more
positive and more receptive to the spreadsheet. This result is consistent with the SUS scores,
where the lower status subjects’ scores were significantly greater than the higher status
subjects’ scores (t-test at the 5% significance level), i.e. lower status subjects rated the
spreadsheet’s usability more favourably than the higher status subjects. Although all the
subjects had problems with the DOS spreadsheet, it would seem the lower status subjects
“liked” the spreadsheet more than higher status subjects.
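To illustrate the kind of comparison reported here, the sketch below runs an independent-samples t-test on two groups of SUS scores using the scipy library; the score values are invented for the example and are not the scores obtained in this study.

    # Illustrative only: the SUS scores below are invented, not the study's data.
    from scipy import stats

    lower_status_scores = [72.5, 80.0, 67.5, 75.0, 70.0, 77.5, 82.5]      # hypothetical (n=7)
    higher_status_scores = [45.0, 52.5, 40.0, 57.5, 50.0,
                            47.5, 55.0, 42.5, 60.0, 37.5]                 # hypothetical (n=10)

    # Two-tailed independent-samples t-test; compare p against alpha = 0.05
    t_stat, p_value = stats.ttest_ind(lower_status_scores, higher_status_scores)
    print("t = %.2f, p = %.4f, significant at 5%%: %s"
          % (t_stat, p_value, p_value < 0.05))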
The above results suggest that one possible cultural factor that may affect the results of the
usability assessment is power distance. Power distance is “the extent to which the less
powerful members of institutions and organisations within a country expect and accept that
power is distributed unequally.” (Hofstede, 1994). From a sample of 50 countries, Hofstede
(1994) identified Malaysia as the country having the highest power distance. This means that
Malaysians in general are willing to accept the fact of inequality in power as being normal.
The power holder’s authority is unquestioned. Subordinates who question it would be seen as
improper and disrespectful (Abdullah, 1996). Furthermore, in a high power
distance country, employees are “afraid” of their employers as employers wield powers such
as the authority to fire employees. Thus, a person of higher status and power (e.g. a manager)
will be more likely to voice his or her feelings of discontent to a person of lower ranked (e.g.
subordinate). However, the reverse is not true. A person of lower status is unlikely to go
against a higher ranked person for fear of retribution.
The experiment results can be explained with respect to the power distance characteristic.
The lower status subjects probably did not like the spreadsheet any more than the higher
status subjects. However, the lower status subjects were more positive about the spreadsheet
as they did not want to question or “go against” the experimenter (a tutor, considered high
status/power holder), maybe for fear of retribution or of appearing disrespectful to the
experimenter. The lower status subjects thus were less critical and “less honest” in their
responses, to the extent of suggesting the use of the DOS spreadsheet in the office and also
replacing Excel with the DOS spreadsheet. On the other hand, the subjects of higher status
were more likely to voice their dissatisfaction as these subjects were of the same status or
higher in the organisation hierarchy (compared with the experimenter) and had little fear of
retribution. This observation is supported by the frank comments made by the higher status
subjects that the spreadsheet was “old fashioned”, “very primitive” and “very outdated”.
Another interesting observation is that the subjects, in their “think aloud” sessions and
interviews, used predominantly English interspersed with Bahasa Melayu (this excludes any
reference to parts of the software with prompts or commands in Bahasa Melayu). This
observation is probably due to the fact that Malaysians are bilingual (speaking both English
and Bahasa Melayu). As such, if an experimenter fluent in both languages is used, the
subjects would be able to “think aloud” or respond in their preferred language. Responses in
the native tongue are more accurate as the subjects are more likely to describe usability
problems more clearly in their native language than in another language. Forcing a subject to use a
language other than their native tongue might impose further cognitive load which may in
turn affect techniques such as the “think aloud” session during which a subject is required to
complete tasks as well as think aloud at the same time. For example, subject S12 whose
native tongue is Bahasa Melayu attempted to use English in the “think aloud” session. S12’s
speech was hesitant and more information may have been garnered if S12 had spoken in his
native language.
Furthermore, the ability of an experimenter to communicate in the native tongue of the
subjects may assist in forming a better rapport. This view is supported by Morais (1997) who
found that in Malaysia, top management used more Bahasa Melayu to reduce the status gap
and to reach out to their employees while the workers used more English.

Recommendations
One implication of the power distance finding is that in order to obtain an honest and accurate
appraisal of the software, an experimenter of lower status than the subject may be required.
Otherwise, if the experimenter is a manager (higher status) and the subject/usability
assessor is a clerk (lower status), an accurate (honest) assessment might not result as the clerk
may say what the manager wants to hear for fear of offence or retribution.
Linguistically, it would be ideal to have usability testers fluent in the language of the users
as this provides better opportunities for the users to speak their native tongue. Forcing
subjects to think aloud in a language other than their native tongue might impose further
cognitive load. Furthermore, an experimenter who is fluent in the subject’s native tongue may
be able to form a better rapport with the subject more quickly, and this may aid in getting
better results in the usability study.

Conclusion
The status factor appears to influence the results of the usability assessment techniques
whereby higher status subjects were more honest than the lower status subjects in their
usability assessment responses. To ensure more honest (accurate) responses, the status of the
experimenter should be equal to or lower than the status of the subjects.
Furthermore, an experimenter who is fluent in the subject’s native language may be able
to elicit more information if the subject is able to communicate in their native language. This
factor is of importance in techniques where verbal responses are required e.g. the “think
aloud” and interviews.

Further Work
Two other factors that may influence the result of the usability evaluation were also
examined. The factors were the gender of the subjects and the familiarity of the subject with
the experimenter. Although gender does not seem to play any role in our study, initial results
indicate that a subject who is familiar with the experimenter may be more honest in the
evaluation. Further investigation of these factors is being carried out.

Acknowledgments
The authors would like to thank Borland International for permission to use the TCALC
spreadsheet source code in their research.

References
Abdullah, A. 1996, Going Glocal, (Malaysian Institute of Management, SNP Offset, Shah
Alam, Malaysia)
Brooke, J.B. 1995, SUS—A Quick and Dirty Usability Scale, In P.Jordan (ed.) Usability
Evaluation in Industry, (Taylor and Francis)
Fernandes, T. 1995, Global Interface Design. (Academic Press, Chestnut Hill, MA)
Herman, L. 1996, Towards Effective Usability Evaluation in Asia: Cross-cultural
Differences. In J.Grundy and M.Apperley (eds.) Proceedings of OZCHI’96: Sixth
Australasian Computer Human Interaction Conference, Hamilton, New Zealand,
Nov., 1996, (IEEE), 135–136
Hofstede, G. 1994, Cultures and Organizations: Software of the Mind, Paperback Edition.
(HarperCollins, Glasgow)
Holyer, A. 1994, Methods for Evaluating User Interfaces, Cognitive Science Research Paper
No. 301, School of Cognitive and Computing Sciences, University of Sussex, Available
as http://www.cogs.susx.ac.uk/cgi-bin/htmlcogsreps?csrp301
Morais, E. 1997, Talking in English But Thinking Like a Malaysian: Insights from a Car
Assembly Plant. In Proceedings of English Is An Asian Language: The Malaysian
Context Symposium, Kuala Lumpur, Malaysia, (Forthcoming)
Nielsen, J. 1990, Usability Testing of International Interfaces. In J.Nielsen (ed). Designing
User Interfaces for International Use, (Elsevier)
Nielsen, J. 1993, Usability Engineering, Paperback Edition, (Academic Press)
Software Publishers Association. 1997, Asia Pacific Application Software Sales Rise 22% in
1996, Available as http://www.spa.org/research/releases
INTERFACE DESIGN
INTERFACE DISPLAY DESIGNS BASED ON
OPERATOR KNOWLEDGE REQUIREMENTS

Fiona Sturrock and Barry Kirwan*

Industrial Ergonomics Group,


School of Manufacturing & Mechanical Engineering,
University of Birmingham,
Birmingham, B15 2TT, United Kingdom.

This study was aimed at designing interface displays which support different
types of knowledge for a specific nuclear power plant scenario. An in-depth
and exhaustive scenario exploration revealed that the current interfaces did
not sufficiently support the operators’ knowledge requirements for the
scenario, and much vital information could be considered inert. Interface
redesign recommendations were made on the basis of the required type of
knowledge, and as a result several new interfaces were designed which
would aid in the re-activation of the inert knowledge. These interfaces were
designed to be selectively called-up via icons on the original display and are
aimed at specifically supporting operators during diagnosis of disturbances.

Introduction
Diagnosis of disturbances in complex and dynamic environments such as nuclear power
plants (NPPs) is considered to constitute one of the greatest cognitive demands
experienced by the human operator (Vicente, 1995; Woods et al, 1994). However the operator
is not expected to perform unaided. The role of the interface is not only to represent system
data in a meaningful and expected format but to support operator diagnosis. Ultimately the
interface design should activate or trigger the correct type of knowledge from memory
allowing the operator to generate a quick and accurate diagnosis of the disturbance.
Research has for many years recognised the importance of supporting the operator
through the interface, especially in complex human-machine systems where incidents are
frequently abnormal and unfamiliar, and hence the potential for disaster is great. Therefore
how to represent the dynamic properties of a system to an operator is one of the most
important research questions of the current period (Brehmer, 1995). There are few detailed
guidelines or methodologies associated with the design of such interfaces available to the
designers, thus rendering interface design a relatively unstructured process, which is
unsatisfactory given the significance of such displays (Seamster et al, 1997). A poorly
designed interface may conceal important diagnostic information, or alternatively the
presentation of such information on the display may not trigger the utilisation of the relevant
knowledge, and as a result the operator’s diagnostic ability may be impaired.

* Currently Head of Human Factors at ATMDC, NATS, UK.

The research described in this paper is the application of the Types Of Knowledge
Analysis (TOKA) approach (Sturrock and Kirwan, 1996) to a classic nuclear power plant
training scenario (Steam Generator Tube Rupture1) focusing predominantly on the redesign
of the interface displays.

Type Of Knowledge Analysis (TOKA) approach


The TOKA approach to interface design is based on the premise that the differing demands
placed on the operator during abnormal or novel scenarios will require a different interface
design to the demands imposed during normal or steady state conditions. The TOKA
approach comprises 5 steps. The first step involves analysing the available knowledge
(concerning the disturbance from the interface) followed by (Step 2) an analysis of the
knowledge required by the operator to diagnose the disturbance. The third step is to
categorise such available and required knowledge according to the six ‘Type Of Knowledge’
(TOK) categories shown in Table 1. The fourth step involves identifying where the required
knowledge is not supported by the interface, and subsequently making recommendations
concerning the redesign of the interface. The final step in the TOKA approach is to test the
redesigned displays against the original displays.

Table 1. TOK Characteristics

Steam Generator Tube Rupture (SGTR) scenario analysis


Initially, an in-depth scenario exploration and knowledge elicitation were carried out using the
abundance of literature on this scenario (compared with other NPP scenarios), a very experienced
operator, and a full-scope Pressurised Water Reactor simulator at the OECD Halden Reactor
Project in Norway. Steps 1–4 of the analysis formed part of an earlier study and readers are directed

1. A steam generator tube rupture scenario in a nuclear power plant, if undetected, could lead to
the radioactive contamination of the secondary side of the nuclear process, which could threaten
the lives of the plant personnel and eventually the environment.

to Sturrock and Kirwan (1997) for a detailed account of the methodology and results. Table 2
however shows an extract of the analysis. Although the SGTR scenario is a classic fault with
operators being trained to recognise the symptoms and to take corrective actions, the analysis
revealed that the knowledge or information required in order to diagnose the scenario was not
optimally supported by the interface, e.g. many of the main symptoms of the scenario are of low
salience, despite their relatively high relevance to the context of the disturbance.

Table 2. Extract of SGTR scenario results

TOKA-based Interface Designs


A detailed examination of the inert knowledge led to the generation of design concepts that
would potentially reactivate such knowledge. A summary of redesign recommendations
associated with each TOK are shown in Table 3. The structure of the interface re-design
process was aided by semi-structured interviews with several complex process interface
designers in order to compensate for the lack of formal interface design guidelines.

Table 3. Interface redesign recommendations associated with TOK



The resultant redesigned displays are accessible from the original displays (as they
support diagnosis rather than routine monitoring) via icons. Each icon represents a different
TOK which can be selectively called-up at any time and in any order. Figure 1 shows an
example of a redesigned TOKA display for the SGTR scenario which would support TOK 4
(maximum integration of systems).

Figure 1. TOKA-based interface design supporting TOK 4

Figure 1 shows all four steam generators (SG) and their direct link with the main steam
manifold, therefore potentially triggering the knowledge that the manifold steam output will
remain constant despite the increasing levels in RY13 (on the current interface the link is not
directly shown, i.e. it is inert). Also displayed is the steam output from the manifold via a trend
graph which shows the steam output over a period of time. This supports the operator by providing
indications concerning how the abnormal functioning of RY13 affects the other parts of the
plant (TOK 4). If the operator wants such information from the original displays then he/she
must know what systems are likely to be affected (assuming that the malfunctioning component
can be identified) and subsequently access the appropriate displays—this however, assumes
that such knowledge will be triggered from memory. Also displayed on this TOK4 format are
the emergency feedwater pumps, which show the direct link between the pumps in the feedwater
system and the SG. Such a design feature supports TOK 4 by emphasising the increased feedwater
supply to the SG which is an abnormal condition.
It is anticipated that such TOK supportive displays would result in operators generating
more accurate and rapid diagnoses of the disturbance by supporting the knowledge that is
required for diagnoses.

Discussion
The TOKA approach identifies parts of the original interface that do not currently support the
operator in terms of knowledge requirements through a highly detailed and structured
walk-through of the task and evaluation of the current interface. The redesign of the interface is
prioritised allowing the inclusion of the most important redesigned features (in terms of supporting
operators’ knowledge requirements for the task), unlike traditional interface design processes.
The potential utility of such a task-analysis-based tool to codify and re-present inert
knowledge has implications for interface design. Whereas traditional interface designs rely
heavily on designers’ expertise and experience, the TOKA approach to interface design, although
still requiring the designer to be experienced, follows a five-step structure producing displays
which have been designed specifically to support operators during diagnosis. The TOKA approach
itself leads the designer to search for these ‘Types Of Knowledge’ which may otherwise be
overlooked. Although this approach is still in its preliminary stages it addresses current research
problems such as what should be displayed and what should not be displayed to the operator.

Summary
The results of this analysis, briefly reported in this paper, show that it is possible to design
interface displays based on operators’ knowledge requirements. The SGTR scenario displays
have been redesigned according to the results of this study and the next phase of this research
will be to compare operator performance when using the redesigned interfaces compared
with performance using the original interface. The end goal of the research is to develop a
task analysis tool, based on the TOKA approach, which can be used prospectively for
determining interface requirements for operator diagnostic support and also for generating
recommendations for display design.

Acknowledgements: The authors would like to thank the HRP staff and the operator for their
co-operation, time and enthusiasm. Thanks must also go to the interface designers (from
Halden Reactor Project, British Nuclear Fuels Limited, and Rolls Royce Associates) for their
expertise and help.
Disclaimer: The opinions expressed in the paper are those of the authors and do not
necessarily reflect those of their respective organisations nor the HRP.

References
Brehmer, B., 1995, Feedback delays in complex dynamic decision tasks. In P.A.Frensch and
J.Funke (eds.), Complex Problem Solving: The European Perspective, (Lawrence
Erlbaum Associates, New Jersey), 103–130
Seamster, T.L., Redding, R.E., and Kaempf, G.L., 1997, Applied Cognitive Task Analysis In
Aviation, (Ashgate Publishing, Aldershot)
Sturrock, F., and Kirwan, B., 1996, Mapping knowledge utilisation by nuclear power plant
operators in complex scenarios. In S.A.Robertson (ed.), Contemporary Ergonomics,
(Taylor and Francis, London) 165–170
Sturrock, F., and Kirwan, B., 1997, Inert knowledge and display design. In S.A. Robertson
(ed.), Contemporary Ergonomics, (Taylor and Francis, London) 486–491
Vicente, K.J., 1995, Ecological interface design: A research overview. In T.B.Sheridan (ed.),
International Federation Of Automatic Control Symposium: Analysis, Design And
Evaluation Of Man-Machine Systems (Pergamon Press, Cambridge), 623–628
Woods, D.D., Johannesen, L.J., and Sarter, N.B., 1994, Behind human error: Cognitive
systems, computers and hindsight, state of the art report, CSERIAC 94–01
UNDERSTANDING WHAT MAKES ICONS EFFECTIVE:
HOW SUBJECTIVE RATINGS CAN INFORM DESIGN

Siné J.P.McDougall*, Martin B.Curry † and Oscar de Bruijn*

*Department of Psychology, University of Swansea,
Singleton Park, Swansea SA2 8PP
†Human Factors Department, Sowerby Research Centre,
British Aerospace plc, Filton, Bristol BS12 7QW

Icons and symbols are now routinely used across a wide range of user
interfaces. This has led researchers and designers to explore the mechanisms
that are thought to make icons effective. However, research has been
hampered by a lack of clarity about exactly which icon characteristics affect
user performance. To help researchers address this issue, we propose that
subjective ratings can be used to measure individual icon characteristics and
control them experimentally. This paper reviews the research that we have
carried out using subjective ratings to explore the effects that icon
concreteness, complexity and distinctiveness have on user performance. We
make recommendations for design practice on the basis of these findings.

Introduction
The escalation in the use of icons to convey information has paralleled the routine
incorporation of computing technology within people’s everyday lives. To date the choice of
which icons to use has largely depended on international standards, guidelines, and
guesswork (e.g. Shneiderman, 1992). As a result, recent research has focused on the need to
arrive at a better understanding of the characteristics that make a good icon (e.g. Scott and
Findlay, 1991).
In order to arrive at a proper understanding of what makes icons effective, researchers
need to be able to identify the effects that different icon characteristics have on user
performance. This paper examines the success of previous research in realising this goal and
highlights the need for greater levels of experimental control.
A good way of controlling icon characteristics experimentally is to obtain subjective
ratings of each characteristic. Although there has been a long tradition of using subjective
ratings to control item characteristics for words and pictures (e.g. Quinlan, 1992), ratings
have yet to be applied to icon research. This problem was addressed by obtaining ratings for a
corpus of icons on a variety of icon characteristics that included concreteness, complexity,
and distinctiveness. These ratings were used in a series of experiments which examined the
roles played by complexity, concreteness and distinctiveness on icon effectiveness. The
results of these experiments are reviewed in order to show how this methodology can be
applied to icon research. The data shows that each of these properties has different
behavioural effects and that these can change as a result of user experience. The implications
of these findings for icon design practice are discussed.

Concreteness and complexity


One of the strongest claims made for icons is that they are easier to use because they are
concrete. Concrete icons tend to be more visually obvious because they depict objects, places
and people that we are already familiar with in the real world. Abstract icons, in contrast,
have an indirect correspondence with our experience and typically represent information by
more ambiguous means using shapes, arrows and lines. The evidence available from both
research and practice suggests that users use the visual metaphor of the real world created by
concrete icons and that, as a result, they are easier to understand (e.g. Rogers, 1986). This
difference in meaningfulness between concrete and abstract icons has been referred to as the
‘guessability gulf’ by Moyes and Jordan (1993) since the meaning of concrete icons can be
guessed more easily than abstract icons when they are first encountered.
One way of accounting for the performance advantages observed for concrete icons arises
from the extra detail provided when depicting objects, places and people. Garcia et al (1991)
measured this detail by applying a complexity metric to icons used in previous studies and
found concrete icons to be consistently more complex than abstract icons (e.g. Arend et al,
1989; Rogers, 1986). This suggests that concrete icons must be more complex in order to
be easier to use. This assumption, however, contrasts sharply with the recommendations of
design guidelines which emphasise the importance of keeping icons as simple as possible
(e.g. Gittens, 1986).
In order to determine which of these propositions was correct, we obtained subjective
ratings of concreteness and visual complexity for a large corpus of 240 icons. These were
drawn from icons in everyday use on machinery, cars, aircraft, computers and public
information signs. Forty raters were asked to assess the concreteness of each icon using a 1–5
rating scale. Another group of 40 were asked to rate the visual complexity of icons using the
same scale. A strong correlation between the perceived concreteness and visual complexity of
the icons would support the idea that it is extra visual detail that enables users to employ the
visual metaphor. We found, however, that this correlation was virtually non-existent (r=–0.03,
p>0.05). What this means is that, contrary to previous research, it is not extra visual detail
that makes concrete icons visually obvious.
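As a sketch of how such a check can be made, the Python below computes a Pearson correlation between two sets of mean ratings; the handful of rating values is invented and merely stands in for the 240-icon corpus.

    import math

    def pearson_r(x, y):
        # Pearson product-moment correlation between two equal-length lists
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sx = math.sqrt(sum((a - mx) ** 2 for a in x))
        sy = math.sqrt(sum((b - my) ** 2 for b in y))
        return cov / (sx * sy)

    # Hypothetical mean ratings (1-5 scale) for a few icons, standing in
    # for the 240-icon corpus used in the study
    concreteness = [4.2, 1.8, 3.5, 2.1, 4.7, 2.9]
    complexity   = [2.6, 3.1, 2.2, 3.8, 2.9, 2.4]

    print("r = %.2f" % pearson_r(concreteness, complexity))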

The effects of concreteness and complexity on user performance


Given that users rate icon concreteness and complexity differently, we might expect that the
effects they exert on user performance will also be different. Indeed, research to date suggests
that while increasing concreteness enhances user performance (e.g. Rogers, 1986; Stammers
et al, 1989), increasing visual complexity diminishes it (Scott, 1993). This, however, does not
tell us anything about the locus of these effects on user performance and when they are likely
to be important.
A study was therefore carried out to examine the role of concreteness and complexity in
more detail. Four types of icons were selected from the 240 which had been previously rated.
These were (a) concrete and complex (b) concrete and simple (c) abstract and complex and (d)
abstract and simple. In all, there were 72 icons—18 of each type. Twenty volunteers were asked
to carry out a search-and-match task in which they were given an icon function and asked to
match it to one of the icons in the display (see Figure 1). Eight other icons were used as a
background, two of each type (see (a)—(d) above). The volunteers saw the full set of icons
many times in order to simulate growing experience. This made it possible to examine how
accurately and quickly volunteers could respond as they learned icon-function relationships.

Figure 1. An example of the computer screen display in the search-and-match task

During initial learning, icon concreteness appeared to be very important; it enabled icon
meaning to be guessed more accurately and quickly. After 9–10 exposures, however,
performance differences between concrete and abstract icons disappeared. This suggests that
concreteness effects are temporary and are likely to have minimal effects on icons that are
frequently used.
The visual complexity of icons was also found to affect the time that users took to respond,
although it had little effect on accuracy. Users took longer to respond to visually complex icons
than those which were simple. This finding was not affected by the number of times icons were
presented. This means that the visual complexity of the icons used in time-critical systems will
be an important determinant of performance that is unlikely to diminish with experience. Thus,
he design of such systems should place emphasis on the creation of simple interfaces rather than
hoping for any performance improvements that may result from experience.

Distinctiveness
Icon distinctiveness is another important factor in determining how easy icons are to use.
Distinctiveness is notoriously difficult to define and is often characterised in terms of how
discriminable icons are from one another. Distinctiveness is often thought to be the product of
an icon’s global features (e.g. shape, colour, size) rather than of its local features (i.e. details
within icons; see Arend et al, 1987). There is strong evidence that icons differing in their
global features can be found more easily in an array because visual search is quicker (e.g.
Scott, 1993). However, in the usability research carried out to date it is often difficult to
distinguish distinctiveness from visual complexity. For example, Byrne (1993) states that
“simple icons (those discriminable based on a few [global] features) seem to help users,
while complex icons are no better than simple rectangles”.

Following on from our previous research, we used ratings to quantify icon distinctiveness.
A group of 30 volunteers were asked to rate the extent to which they felt target icons ‘stood out’
from other background icons in an array of nine (similar to the array shown in Figure 1).
These icons were the same as those used to determine the effects of concreteness and
complexity. The targets were shown against a series of different backgrounds: mixed (using
the 4 different types of icons as in Figure 1 above) or wholly concrete, abstract, simple or
complex (forming a uniform background).
Subjective ratings suggested that icon distinctiveness was largely determined by the
contrast of a target’s characteristics in comparison to the characteristics of the background
icons. For example, simple target icons were rated as being most distinctive against a
background of complex icons. Similarly, concrete icons were rated as most distinctive when
presented against a background of abstract icons. There was little effect of distinctiveness
when icons were presented against a mixed background. To summarise, there appear to be
two types of contrast: (a) a visual contrast that relates primarily to differences in icon complexity, and (b) a semantic contrast that relates primarily to differences in concreteness.
Taken together, these findings suggest that the source of distinctiveness effects is unlikely to lie solely in visual search (as suggested in previous research) but may also lie in users’ ability to select icons on the basis of their meaning. The effect of creating visual and semantic
contrasts on user performance was subsequently examined.

The effects of distinctiveness on user performance


Users were asked to complete the same search-and-match task as shown in Figure 1.
However, in this instance, the displays were designed to create a contrasting background for
the target icon. In one study visual contrasts were used, e.g. the lock (a simple icon) shown in
Figure 1 was set against a background of complex icons. In a second study semantic contrasts
were used, e.g. the lock (which is also concrete) was set against a background of abstract icons.
So, how was user performance affected by these contrasts? Our findings show that generating
contrasts can be very effective in enhancing user performance, but designers need to be aware of exactly what types of contrast are likely to be effective.
When visual contrasts were created by contrasting the target’s complexity with the
background display, it was clear that user performance was faster when the target was simple
and the background was complex. However, when a complex icon was presented against a
simple background, user performance was not improved. The behavioural effects generated by the use of contrast are therefore not symmetrical, and designers should not treat icon contrasts as being two sides of the same coin.
When semantic contrasts were used in displays, similar effects were observed. Concrete
icons presented against an abstract background enabled users to respond very quickly in the search-and-match task. However, when the contrast was reversed, with abstract icons being presented against a background of concrete icons, user performance was not enhanced. Once again, this shows that the behavioural effects created by generating a contrast were not symmetrical.

Design implications of research findings


To conclude, we believe that our findings are clearly relevant to design practice. The
implications of our findings are summarised below.

• The addition of visual detail does not make icons more visually obvious.
• The difference in user performance between concrete and abstract icons, known as the ‘guessability gulf’, is short-lived and does not affect experienced users.
• Concrete icons are likely to be most useful in public information systems or for icons that
are rarely used (such as warnings).
• The use of complex icons should be avoided in displays where user responses are time-
critical. Even experienced users respond more slowly to complex icons.
• Both visual and semantic contrasts can be created within displays.
• Contrasts do not always enhance user performance and it is important for designers to be
clear about exactly what contrasts are effective.

Acknowledgements
This research has been supported by a British Aerospace grant (SRC/UCS/060495). The icons
in Figure 1 have been reproduced with permission of the British Standards Institute;
International Electrotechnical Commission; Microsoft Corporation.

References
Arend, U., Muthig, K.P. & Wandmacher, J. (1987). Evidence for global feature superiority
in menu selection by icons. Behaviour and Information Technology, 6, 411–426.
Byrne, M.D. (1993). Using icons to find documents: simplicity is critical. Proceedings of
INTERCHI ‘93, 446–453.
Garcia, M., Badre, A.N. & Stasko, J.T. (1994). Development and validation of icons varying
in their abstractness. Interacting with Computers, 6, 191–211.
Gittins, D. (1986). Icon-based human-computer interaction. International Journal of Man-
Machine Studies, 24, 519–543.
Moyes, J. & Jordan, P.W. (1993). Icon design and its effect on guessability, learnability and
experienced user performance. In J.D.Alti, D.Diaper & S.Guest, People and
Computers VIII. Cambridge: Cambridge University Press.
Quinlan, P. (1992). The MRC Psycholinguistic Database. Cambridge: Cambridge University
Press.
Rogers, Y. (1986). Evaluating the meaningfulness of icon sets to represent command
operations. In M.D.Harrison & A.F.Monk (Eds.), People and computers: Designing
for usability. Cambridge: Cambridge University Press.
Rohr, G. & Keppel, E. (1985). Iconic interfaces: Where to use and how to construct. In
H.W.Hendrick & O.Brown (eds.), Human factors in organisation design and
management. Amsterdam: Elsevier Science Publishers.
Scott, D. & Findlay, J.M. (1991). Future displays: A visual search comparison of computer
icons and words. In E.J.Lovesey, Contemporary Ergonomics: Proceedings of the
Annual Conference of the Ergonomics Society, 246–251. London: Taylor & Francis.
Scott, D. (1993). Visual search in modern human-computer interfaces. Behaviour &
Information Technology, 12, 174–189.
Shneiderman, B. (1992). Designing the User Interface: Strategies for Effective Human-
Computer Interaction. 2nd edition. Reading, MA: Addison-Wesley.
REPRESENTING UNCERTAINTY IN DECISION SUPPORT
SYSTEMS: THE STATE OF THE ART

Caroline Parker

HUSAT Research Institute


Loughborough University, LE11 1RG

There is increasing interest in the potential of Decision Support Systems (DSS) in agriculture. DSS are usually based on simulation models with
which a degree of uncertainty is always associated. In response to the
practical problem of how to present this uncertainty to non-technical users a
literature review was undertaken. The maturity of this technology in other
industrial sectors led to the expectation that answers already existed. Results
so far suggest that this belief was unfounded. This paper is a first attempt to
collate the answers which do exist and some general guidelines for
presentation are given.

Introduction
This paper stems from a very specific practical problem: the need to produce a design
solution to an interface requirement for an agricultural decision support system. DESSAC
(Decision Support Systems for Arable Crops) is a MAFF Link-funded project which has as
part of its remit the development of a decision support system (DSS) for winter wheat
fungicide use.

A decision support tool (DSS) can be defined as a tool which helps the user to make better
decisions by providing access to a model or rule based representation of the decision area and
to supporting information. There is an increasing interest in the potential of these tools in the
agricultural and horticultural industries. Agricultural DSS, in common with any system
attempting to describe and predict natural processes, are not capable of giving definitive
answers. The emphasis is on support and not decision making. Agricultural DSS contain one
or more simulation models which approximate the interactions between biological systems.
As these models are only estimates there is always a degree of uncertainty associated with
their output (often shown as a probability distribution). A major requirement at the interface
is the expression of this uncertainty as well as the general estimate of risk.

The DESSAC system needs to display a variety of solutions to a spray plan problem so that
the user can identify the best fit solution and the differences (or lack of these) between the
risk levels associated with them. Farm-based users need to know what the worst, best and
most likely outcomes, and the spread between them, might be, so that they can make realistic
comparative decisions.
What exactly is the problem in displaying something that the statistical sciences have been
expressing for a long time? Firstly, there is the non-technical background of the target audience.
The DESSAC project is working on the assumption that the user population will be computer
literate but have little familiarity with statistics or the nature of modelling uncertainty. Previous
work in the area suggests that few agricultural users are aware of the limitations of models;
there is a tendency to either accept the output of the tool as 100% accurate or to view it with an
extreme degree of prejudice. Good communication of uncertainty is always critical to good
decision making (Cleaves, 1995) but obviously more so under these circumstances.

Secondly there is the underlying assumption that these systems should also improve the
general level of decision making. The reason that agricultural DSS are funded is the urgent
need to reduce, or to target more effectively, the use of agro-chemicals in the UK. The human
interface to the DSS therefore has three jobs to do:

• to guide the user in the direction of better decision making,
• to present decision supportive information to the user, and
• to make best use of human capabilities.

It has to do this in the context of a non-technical user group, working with a complex problem in real time.

As this is a problem common to all decision support systems, an early literature review was
conducted in the strong belief that answers would already exist. Six months later few solutions
have been found. This paper is a first attempt to bring together the answers which do exist and
to place them in a context which is meaningful for the developers of decision support systems.

Types of uncertainty
DSS are based either on simulation models, or on rules extracted from domain experts, or on
a mixture of the two, and their answers to the questions posed by the user will always have a
degree of uncertainty around them. But what exactly do we mean by uncertainty in this
context? What types of uncertainty have been identified? Finding an appropriate
categorisation for uncertainty should make it easier to group design solutions in a way which
is useful for DSS developers.

Krause and Clark (1993) provide a branching classification system in which types of
uncertainty are first divided according to whether they relate to a single proposition or to a
group of propositions; then whether they arise from ignorance or from conflict and finally
into 8 sub categories (op cit. p.7). The critical ones for this discussion relate to the unary set
and are: Indeterminate Knowledge (vagueness); Partial Knowledge (confidence);
Equivocation; Ambiguity; and Anomaly (error).

Table 1: Uncertainty types within information categories



A DSS provides the user with many types of information, e.g. the set identified by Brookes
(1985). Each type of information may bring with it its own form of uncertainty. Table 1 above
lists Brookes’ information retrieval categories and suggests the types of uncertainty, as
defined by Krause and Clark, to be found within them.

The table shows that different types of uncertainty are present in the various information
categories and that there is a further division between input and method: uncertainty within a DSS relates either to the data fed into the system and/or to the method (models or rules)
used to generate the answers to decision enquiries (Arinze, 1989).

In the case of the first two of Brookes’ categories, the mechanisms for generating an answer
are purely mechanical and based on very simple and very complete algorithms. No
uncertainty is generated by mechanisms (models or rules) in these cases; the only possible source of uncertainty is that associated with the data fed into them. As all data is prey to anomaly (input error) and much real data contains missing values (indeterminate knowledge), it must be assumed that these types of uncertainty are always associated with inputs.

Where data is generated (i.e. by a model) rather than input there will be uncertainty
associated with the mechanism used for the generation (predictive information). In the case
of agricultural systems any prediction of weather conditions or disease progress will be prone
to ‘partial knowledge’ uncertainty because it is impossible to produce an accurate simulation
of these complex and chaotic systems. Any other use of models or rules is prone to the same
type of uncertainty. On the output side these systems may produce ambiguous results where it
is not clear which of a number of outcomes is preferable. Rule bases are also prone to
equivocation, where two or more rule sets are equally applicable.

Another level of categorisation which is particularly relevant to uncertainty in DSS is the type
of data being displayed, i.e. whether it is nominal, ordinal, interval or ratio-based numeric data, or whether it is textual. Expert systems, based on rules extracted from experts, may produce numeric or textual data, but the numeric data is likely to be an expression of an expert opinion. Model-based systems, on the other hand, produce numeric uncertainty based on the application
of equations to numeric inputs (inputs which may of course be based on non-numeric judgements).

Display solutions
Many papers describe the performance of graphical vs. tabular vs. textual displays for interval
and ratio-type numeric data, often with conflicting findings; others suggest that a mix of tabular and graphical displays produces the best performance (e.g. Bennet, 1992). In general the literature seems to suggest that a graphical format is the easiest to interpret, even for small data sets (Melody Carswell, 1997), and given the additional difficulties surrounding uncertainty a graphical format
may be considered to be the better approach. However, the influence of the graphical display is
also found to be highly dependent on the type of task it is intended to support. The right graphical
display is thus needed to express specific task uncertainties.

In the only experiment of its type that this survey was able to locate, Ibrekk and Morgan (1987) looked specifically at the problem of graphically representing uncertainty to non-technical users. Two
types of users (non-technical and technically aware) were presented with 9 graphical displays
of the same data with and without instruction. The displays were: a point estimate with error
bar; six displays of probability density (discretised display, pie chart, conventional, mirror image
display, horizontal bars shaded to display density using dots or vertical lines); a Tukey box
plot; and a cumulative distribution function. Six forms of the probability density display were
used because formally equivalent representations are not often psychologically equivalent.
Subjects were asked to make judgements about realistic events such as the depth of predicted
snowfall and flood. They found that the performance of a display depended on the information
that the subject was trying to extract, and concluded that displays that explicitly show the
information people need show the best performance. Pie chart displays were found to be
potentially misleading and subjects displayed a tendency to select the mode rather than the
mean unless the mean was explicitly marked. Where subjects were asked to make judgements
about probability intervals in displays that did not forcefully communicate a sense of probability
density, there was a tendency for them to use a linear proportion strategy equivalent to an
assumption of a uniform probability density. Explanations had little effect on performance
although there was evidence that subjects were trying to use them. Another finding was that
there was little difference between the performances of the technical and non-technical groups, suggesting that a ‘rusty’ knowledge of statistics, or a graduate degree, will not necessarily
improve performance. Designs which support non-technical users will therefore be equally
valuable to the technically literate.

The alternative to graphical or tabular representations of uncertainty is textual/verbal representation. Budescu and Wallsten (1995) investigated information processing, choice
behaviour and decision quality when subjective uncertainty was expressed linguistically or
numerically. They identified two further categorisations of uncertainty, ‘precise’ and ‘vague’.
Uncertainty is precise if it depends on external, quantified random variation; and vague if it
depends on internal sources related to lack of knowledge or judgements about the nature of the
database (op. cit.). These distinctions map well onto Brookes’ information categories. Precise uncertainty can be linked to the first three categories, vague uncertainty to the last two, because they relate very much to internal, user-based uncertainty. In a later experiment Olson and Budescu (1997) found that verbal representations outperform numeric ones when the nature of the underlying uncertainty is also vague. The best mode of communication, they suggest, is the one
which most clearly matches the nature of the event and the source of its underlying uncertainty.

Summary
While there is insufficient room in this short paper to expand on, or even describe, all of the
data gathered in this exercise, it is possible to make broad recommendations. It would appear
that there are 3 key issues which impact on the form of representation for displaying
uncertainty:

• the type of uncertainty being displayed (e.g. Krause’s categories);
• the type of data being displayed (e.g. numeric, textual, precise or vague); and
• the user’s requirement for information to support the task (mean, mode, etc.).

In the latter case users may, or may not, need to see the reasoning behind the data. Tufte
(1997) argues quite forcefully that they do; Ackoff (1967) that they don’t. In the agricultural
domain it seems likely that many farmers will not want to see the reasoning behind the data
whereas most agronomy consultants and the more technically minded farmers will. The
answer seems to be user dependent, requiring a layered interface approach.

The literature surveyed to date would seem to suggest the use of graphics as a first choice for
representing numeric data of the ‘precise’ type, i.e. relating to factual, instructive and predictive information, with tabular representation providing additional support. Where the data may be either numeric or textual and the type of uncertainty is ‘vague’, i.e. factual inferential and causal inferential information, textual representation is to be preferred.
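One illustrative way of operationalising these broad recommendations (not part of the DESSAC design) is a simple lookup from Brookes' information categories to a preferred display format. The category names follow the text; the mapping itself is an assumed reading of this summary rather than a validated rule set:

```python
# Hedged sketch: the broad recommendations above encoded as a simple lookup
# that a DSS interface layer could consult. Category names follow the text;
# the mapping is an illustrative reading of this summary, not a validated rule set.
DISPLAY_GUIDANCE = {
    # 'precise' uncertainty: graphical first, tabular as additional support
    "factual":             {"uncertainty": "precise", "display": ["graphical", "tabular"]},
    "instructive":         {"uncertainty": "precise", "display": ["graphical", "tabular"]},
    "predictive":          {"uncertainty": "precise", "display": ["graphical", "tabular"]},
    # 'vague' uncertainty: textual/verbal representation preferred
    "factual inferential": {"uncertainty": "vague",   "display": ["textual"]},
    "causal inferential":  {"uncertainty": "vague",   "display": ["textual"]},
}

def recommended_display(information_category):
    """Return the preferred display formats for a Brookes information category."""
    return DISPLAY_GUIDANCE[information_category]["display"]

print(recommended_display("predictive"))  # ['graphical', 'tabular']
```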

The results of Ibrekk and Morgan’s experiment suggest that, within graphical display types, mirror
image displays and the shaded bar displays of probability density are the best for communicating
the ranges that variables assume, and box plots or simple error bars are the best way of
communicating means. It may be the case, however, that explicitly marking the mean on the
probability density display would produce the optimal solution for mean and range.
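A minimal sketch of such a display, assuming hypothetical forecast data and standard plotting libraries, is a shaded probability density with the mean explicitly marked:

```python
# Minimal sketch with hypothetical forecast data: a shaded probability density
# with the mean explicitly marked, as suggested above for communicating both
# the mean and the range of an uncertain quantity.
import numpy as np
import matplotlib.pyplot as plt

values = np.random.default_rng(1).normal(loc=12.0, scale=3.0, size=2000)  # e.g. predicted snowfall (cm)
mean = values.mean()

fig, ax = plt.subplots()
ax.hist(values, bins=40, density=True, color="lightgrey")       # shaded density estimate
ax.axvline(mean, color="black", linewidth=2, label=f"mean = {mean:.1f} cm")
ax.set_xlabel("Predicted snowfall (cm)")
ax.set_ylabel("Probability density")
ax.legend()
plt.show()
```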

Conclusion
The title of this paper refers to the ‘state of the art’ in the representation of uncertainty in
decision support systems, and this paper has made a first attempt to define it. However, while there is a great deal of work to which this paper has not referred, given the limited space, the
tragedy is that most of it is microscopic in scope. Published work deals with human limitations
in relation to different display mechanisms, to the nature of uncertainty, to the differences between
expert and non-expert users and to many other issues surrounding the problem. Very little is
directly relevant to the designer of decision support tools. Indeed, this review has made it apparent that not much has changed since the publication of Morgan and Henrion’s book in 1990, in which
they concluded that: “for the most part the absence of empirical studies of the relative virtues of
alternative displays means that the choice for displays remains largely a matter of personal
judgement” (op. cit. p. 220). A great deal more targeted research is required if developers of
decision support tools are to be given the practical guidelines they need.

Acknowledgements
The author wishes to thank Loughborough University; the DESSAC project, MAFF and
HGCA for funding; Murray Sinclair; David Parsons of Silsoe Research Institute; and the
many e-mail contacts and colleagues who provided leads and pointers to their own and others’ work in this area.

References
Ackoff, R.L. 1967. Management Misinformation Systems. Management Science, 14(Series
B), 147–156.
Arinze, B. 1989. Developing Decision Support Systems from a model of the DSS/User
Interface. In G.I.Doukidis (Eds.), Knowledge based management support systems
166–182. (Chichester: Ellis Horwood.)
Bennet, K.B. & Flach, J.M. 1992. Graphical Displays: Implications for Divided Attention,
and Problem Solving. Human Factors, 34(5), 513–533.
Brookes, C.H.P. 1985. A Framework for DSS Development. In Transactions of Fifth
International Conference on DSS.
Budescu, D.V. and Wallsten, T.S. 1995. Processing Linguistic Probabilities: General
Principles and Empirical Evidence. In J.Busemeyer, R.Hastie and D.L.Medin
(Eds.), The Psychology of Learning and Motivation. 275–318. (New York: Academic
Press.)
Cleaves, D.A. 1995. Assessing and Communicating Uncertainty in Decision Support
Systems. Policy Analysis. AI Applications, 9(3), 87–102.
Ibrekk, H. & Morgan, M.G. 1987. Graphical Communication of Uncertain Quantities to
Non-technical People. Risk Analysis, 7(4), 519–529.
Krause, P. and Clark, D. 1993. Representing Uncertain Knowledge: An Artificial Intelligence
Approach. (Kluwer Academic Publishers).
Melody Carswell, C. & Ramzy, C. 1997. Graphing small data sets: should we bother?
Behaviour and Information Technology, 16(2), 61–71.
Morgan, M.G. and Henrion, M. 1990. Uncertainty: A guide to dealing with uncertainty in
quantitative risk and policy analysis. (Cambridge, Massachusetts: Cambridge
University Press).
Olson, M.J. and Budescu, D.V. 1997. Patterns of Preference for Numerical and Verbal
Probabilities. Journal of Behavioural Decision Making, 10, 117–131.
Tufte, E.R. 1997. Visual Explanations: Images and Quantities, Evidence and Narrative.
(Cheshire, Connecticut: Graphics Press.)
REPRESENTING RELIABILITY OF AT-RISK
INFORMATION IN TACTICAL DISPLAYS FOR FIGHTER
PILOTS

Maddalena Piras1, Stephen Selcon2, Jeffrey Crick2 and Ian Davies1

1 Department of Psychology, University of Surrey, Guildford, Surrey GU2 5XH
2 Human Factors Group, Systems Integration Department, Air Systems Sector, DERA, Farnborough

We report a study of representing the reliability of ‘at-risk’ information to pilots. The Launch Success Zone (LSZ) shows the pilot whether they are
within firing range of an enemy missile. Here we compare four ways of
representing the reliability of LSZs. Three designs (qualitative; quantitative;
and graphical) displayed threat information above the symbols representing
enemy aircraft, while a fourth display integrated threat information with the
representation of the LSZ. Using a visual search paradigm, it was found that
the graphical representation produced the fastest decisions, and that hostiles
with the highest risk levels were detected most quickly.

Introduction
We report a study that is part of a research program to identify optimal ways of representing
‘certainty’ information in tactical displays during air-to-air combat (see Selcon et al., 1995;
Crick et al., 1997). Head down displays (HDD) can represent whether the pilot is within the
firing range of enemy missiles (the Launch Success Zone or LSZ), but the reliability or
certainty of these boundaries is variable. However, it is also possible to indicate the reliability of these boundaries, and the questions we address here are whether such information should be included in HDDs and, if so, how.
Previous work by Selcon et al. (1995) and Crick et al. (1997) found that pilots could make
tactical use of LSZs. Further, these LSZs were used most effectively when depicted in a
graphical format. Similarly, Kirschenbaum and Arruda (1994) found that graphical forms of
representing uncertainty for ships produced better performance than verbal representations.
The present experiment extended Selcon et al.’s investigation by including information
representing the reliability of the LSZs. Specifically, we compared four ways of representing
certainty: qualitative; quantitative; graphical and integrated. The qualitative display
represented certainty with abbreviations above the symbol for the enemy aircraft, e.g. VL
(very low). The quantitative display gave the certainty information as percentage risk scores
placed above the enemy symbol. The graphical display represented certainty by the length of
a bar positioned above the enemy. And the integrated display represented certainty by how
continuous the line depicting the LSZ was.

Method

Subjects
There were 16 civilian subjects, all members of DERA, with ages ranging from 19 to 30 years;
there were eight men and eight women. All had normal or corrected to normal eyesight, and
none were pilots.

Apparatus
The stimuli were displayed on a Silicon Graphic ZX workstation and the keyboard was used
as the response device. The display showed the ‘ownship’ (the pilot’s own location) at the
bottom of the display, represented by a triangle with a direction indicator (a line) protruding
from it (see Figures 1–4). All displays showed three symbols representing enemy aircraft:
circles with direction indicators. The enemy aircraft were at headings of either 180°, 150°,
120° or 90°. Each hostile aircraft was ranked in terms of its threat. Thus, the hostile could
either be of a high, medium or low threat. Each enemy aircraft had its own Launch Success
Zone (LSZ). The LSZ of one hostile was always covering the ownship. In figures 1–4 the
ownship is represented by a triangle and the hostile aircraft by circles. The LSZs are shown as
discrete regions displaced from the respective hostile in the direction of flight. Thus the
direction indicators point at the respective LSZ, while the reliability of the LSZ is shown in a
variety of ways, as follows.

1) Qualitative: (Figure 1) the symbol for each enemy aircraft had lettering above it representing the level of
certainty: VL (very low certainty); L (low certainty);
QL (quite low certainty); QH (quite high certainty); H
(high certainty) and VH (very high certainty).

Figure 1. Qualitative representation of uncertainty

2) Quantitative: (Figure 2) numbers were positioned above each enemy aircraft representing the
level of certainty. The percentage levels were as
follows: 6%, 18%, 32%, 66%, 78% and 93%. The
lowest percentage represented the lowest certainty, and
the highest percentage represented the highest
certainty.

Figure 2. Quantitative representation of uncertainty

3) Graphical: (Figure 3) a bar was positioned above each enemy aircraft with shading corresponding to the
level of certainty. The more shading the bar had, the
more reliable the LSZ information was.

Figure 3. Graphical representation of uncertainty.

4) Integrated: (Figure 4) the continuity of the line representing the LSZ was varied to represent certainty or
reliability. The more continuous the line, the greater the
certainty.

Figure 4. Integrated representation of uncertainty.

There were sixteen base scenarios. Each scenario was used six times. Each time, however,
a different combination of certainty levels was displayed. Only one aircraft at a time represented the highest certainty, and no two aircraft displayed the same certainty at a given time. Scenarios were counterbalanced for variations in the position of the enemy aircraft on the screen.
Each condition consisted of 96 stimuli plus five practice stimuli.

Procedure
Participants were asked to respond to the enemy aircraft which had the highest certainty.
They responded by making a keyboard response in the form of aircraft 1, 2 or 3. Before the
start of each condition, specific instructions for that condition were given. When the practice
trials were completed, participants had the opportunity to ask any questions. The experimental session then started; it was subdivided into four blocks of 96 trials. The
order in which conditions were presented to each subject was counterbalanced using a Latin
square design.
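For illustration only, a cyclic Latin square of the kind that could be used to counterbalance the four display conditions is sketched below; the square actually used in the study is not reported here:

```python
# Illustration only: a cyclic 4x4 Latin square that could counterbalance the order
# of the four display conditions across groups of subjects. The square actually
# used in the study is not reported here.
conditions = ["qualitative", "quantitative", "graphical", "integrated"]

latin_square = [
    [conditions[(row + col) % len(conditions)] for col in range(len(conditions))]
    for row in range(len(conditions))
]

for group, order in enumerate(latin_square, start=1):
    print(f"Group {group}: {' -> '.join(order)}")
```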

Results
Data from one subject were discarded due to a high error score (over 50%), leaving the total number of subjects at 15. Figure 5 shows the mean reaction times across subjects for each
kind of display and each risk level.
The clearest trend that can be seen in Figure 5 is that RTs for the graphical display are
lower than for the other three types. In addition, RTs to high risk symbols seem to be lower
than for other risk levels. Two-way ANOVA (display by threat) supported these impressions.
Both main effects were significant: display (F=82.1, d.f.=3,42, p<0.001) and threat level (F=17.2, d.f.=2,28, p<0.001). In the first case, the significant effect is due to the graphical
display producing the fastest performance, while high risk symbols also produced fast
performances. There was no suggestion of any interaction between the two main effects.
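A repeated-measures ANOVA of this form could be reproduced on data with the same factor structure. The sketch below is not the original analysis: it uses hypothetical reaction times and the statsmodels AnovaRM routine purely for illustration:

```python
# Hedged sketch, not the original analysis: a two-way repeated-measures ANOVA
# (display x threat) on hypothetical reaction times with the same structure as
# the study (15 subjects, 4 display types, 3 threat levels), using statsmodels.
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
rows = []
for subject in range(15):
    for display in ["qualitative", "quantitative", "graphical", "integrated"]:
        for threat in ["low", "medium", "high"]:
            base = 1.1 if display == "graphical" else 1.8   # graphical assumed fastest
            rows.append({"subject": subject, "display": display,
                         "threat": threat, "rt": base + rng.normal(scale=0.2)})

data = pd.DataFrame(rows)
result = AnovaRM(data, depvar="rt", subject="subject",
                 within=["display", "threat"]).fit()
print(result)   # F values and degrees of freedom analogous to those reported above
```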

Figure 5. Mean RTs for each level of threat and for each type of display

Discussion
There were two clear effects in the results. First, the graphical representation of reliability
produced the fastest RTs. The size of this effect was substantial: RTs to the graphical display
were about 0.75 seconds faster than to the next fastest symbols. Second, there was an effect of
threat level: medium threat levels produced the slowest RTs and high threat levels produced
the fastest RTs. However, the size of the threat effect was small relative to the symbol effect
(about 0.15 seconds) and probably reflects subjects searching the display radially about the
ownship at the bottom of the display (see Figures 1–4).

The four types of symbols evaluated in this study were already under consideration by the
display designers. While the trends in our results are clear in supporting the use of the
graphical display, it is of course possible that some other design would produce better
performance than the current best design. The choice of symbols is largely governed by ‘craft
knowledge’ as there is no adequate theory of symbol processing that fits all possible types of
symbol. However, given the size of the effect found, and the significance of such an effect in
the fast moving world of the fighter pilot, it is worth extending the research to evaluate other
display symbols, and the stability of results across a range of tasks and expertise.

Acknowledgements
We are very grateful to colleagues at DERA, Craig Shanks, Maitland De Souza, Susan
Driscoll and Alex Bunting, for valuable discussions and help with the experiment. We would
also like to thank Nigel Woodger of University of Surrey, for laying out the paper.

References
Crick, J.L., Selcon, S.J., Piras, M., Shanks, C., Drewery, C., and Bunting, A. 1997,
Validation of the explanatory concept for decision support in air-to-air combat.
Proceedings of Human Factors and Ergonomics Society 41st Annual Meeting, in
press.
Kirschenbaum, S.S. and Arruda, J.E. 1994, Effects of graphic and verbal probability
information on command decision making. Human Factors, 36(3), 406–418
Selcon, S.J., Bunting, A., Coxell, A., Lal, R., and Dudfield, H. 1995, Explaining decision
support: an experimental evaluation of an explanatory tool for data-fused displays.
Proceedings of the 8th International Symposium on Aviation Psychology, 1, 92–97,
Columbus, OH
Semantic Content Analysis of Tasks Conformance

Alex Totter, Chris Stary

University of Linz
Department of Business Information Systems, Communications Engineering
Freistädterstrasse 315, 4040 Linz, Austria

Design principles and usability measurements, such as task conformance, are widely used in the course of information system development and user interface
evaluation. Although there exist commonly accepted frameworks for these
principles and measurements, such as the ISO-standard 9241 Part 10, techniques
for development and evaluation vary to a great extent when implementing these
principles and measurements. As a consequence, the results of utilizing the
principles and measurements for software development and evaluation lack
quality in terms of reliability, validity and objectivity. In order to overcome
this deficiency, semantic content analyses and, subsequently, analytical definitions capturing the meaning are required for each of the principles and measurements.
In this paper the results of the semantic content analysis for one of the major
principles, namely tasks conformance, are reported. The presented semantic
content analysis has been performed on six techniques for user-interface
evaluation that contain different interpretations of tasks conformance. The results
should be used to avoid further diversification in interpreting design
principles and evaluation measurements.

Introduction
Design principles and usability measurements, such as task conformance and adaptation, are
widely used in the course of user interface development and evaluation. They are part of
design and evaluation methodologies, such as EVADIS II (Oppermann et al., 1992),
of international standards, such as ISO 9241 Part 10 (1990), and of directives, such as
90/270/EEC (EU-Directive, 1990). Their understanding of task conformance is mostly based
on the following interpretation: “A dialogue supports task conformance, if it supports the user
in the effective and efficient completion of the task. The dialogue presents the user only those
concepts which are related to the task” (ISO 9241 Part 10 1990).
In order to gain insights into the concept and practical impact of task conformance for
design and evaluation, first a conceptual analysis and then a semantic content analysis provide the basis for determining the semantics (meaning) of the principle itself and related measurements. Such a specification of meaning lays the ground for an analytical definition of the
principle. This definition can then be used to develop reliable, objective, and valid techniques
for the development and the evaluation of user interfaces. Figure 1 illustrates the addressed
cycle for the improvement of quality in general: In a first step, the principles and
measurements that are part of standards as well as techniques for design and evaluation are
identified. For each of the principles and measurements the descriptions as well as their
utilization in different techniques have to be acquired, compared, and analyzed. This second
step is termed semantic content analysis. In case the use of a principle or measurement in
different techniques leads to different descriptions, further activities are required to ensure
proper understanding. These activities comprise an explicit identification of the meaning
(=analytical definition) through a meaning analysis of a principle or measurement, as for instance proposed by Bortz and Döring (1995). Meaning analysis increases the
transparency of the subsequent operational definition, since it provides the terminological and
conceptual framework for the development of techniques for user interface development and
evaluation. Figure 1 shows this transition and its result on the right side. Once the semantics
of a principle or measurement has become transparent, its operational definition can be
performed on a sound epistemological basis.

Figure 1. Methodological Framework for Quality Improvement

In this paper we focus on the semantic content analysis of a particular principle for design
and evaluation, namely tasks conformance. The benefits of the analysis are demonstrated
through elaborating the terminological and conceptual deficiencies in the context of
developing a proper technique of evaluation.

The Investigation
The inputs to the semantic content analysis have been extracted from the following
techniques: ABETO (Technology Consulting Nordrhein-Westfalen, 1994), Ergonomics-
Checker (Technology Consulting Nordrhein-Westfalen, 1993), EVADIS II (Oppermann et al,
1992), Evaluating Usability (Ravden, Johnson, 1989), IsoMetrics (Willumeit et al, 1996),
Software Checker (TCO, 1992). The selection of these techniques was based on the criteria
of availability and accuracy: (i) How difficult is it to get access to the technique and use it
practically?; (ii) Does it provide a description of task conformance similar to standards, such
as ISO 9241 part 10 (1990)? Only those parts of the techniques that focus on the evaluation of software have been considered for the analysis. Most of the techniques
are based on the ISO-standard 9241 Part 10.
In accordance with the goal of our study, we investigated whether the selected
techniques provide a theoretically sound operational definition of task conformance based on
their descriptive interpretations of this principle.
The semantic content analysis has been based on all of the questions of the six selected
techniques. Overall, 74 questions have been identified with respect to task conformance
exclusively. For each of the techniques the identified questions have been cross-checked for
mutual semantic correspondence. In order to ensure objectivity the entire set of cross-checks
has been performed by two independent evaluators who are experts in the field of software
ergonomics. Figure 2 shows this first step of the semantic content analysis. This step
identifies the redundant questions of task conformance within the set of techniques under
investigation (see first and second column of table 1).

Figure 2: Mutual Cross-check of Techniques with Questions Concerning Tasks Conformance Exclusively

In the second step of the qualitative content analysis the questions of each technique have
been checked mutually against those questions of all other techniques that have not been
related to task conformance initially. Again, the semantic correspondence of the questions has
been checked.

Figure 3: Cross-check of Questions Concerning Task Conformance (TC) with Non-Task-Conformance Questions of Other Techniques

Figure 3 details one iteration of this step. The first input to the analysis, namely the set of
questions concerning task conformance, remains identical. However, in contrast to step 1, the
second input is the set of questions of the other technique(s) that is not directly related to task conformance, i.e. the set of questions that has not been involved in step 1. For
each of the techniques the set of cross-checks as shown in Figure 3 has been performed, again
by two independent experts in the field of software ergonomics. Step 2 has been considered to
be completed when all the questions had been cross-checked for semantic correspondence(s).
The results of this step can be represented in a correspondence matrix (partly shown in Table
1). This step has identified all those questions that can be assigned not only to task conformance
but also to other principles. Such multiple assignments indicate problems of validity, due to
mutual dependencies.
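Purely as an illustration of the cross-check bookkeeping (the study's correspondence judgements were made by human experts, not by an algorithm), a correspondence matrix between two small, invented question sets could be recorded as follows:

```python
# Illustrative bookkeeping only: the study's correspondence judgements were made
# by two human experts, not by an algorithm. The question texts below are invented,
# and the word-overlap test is a crude stand-in for an expert judgement.
tc_questions_a = {
    "A1": "does the dialogue present only concepts related to the task",
    "A2": "can the user complete the task without unnecessary steps",
}
questions_b = {
    "B1": "are only concepts related to the task presented",
    "B2": "can the user individualise the dialogue",
}

def corresponds(q1, q2, threshold=4):
    """Crude stand-in for an expert judgement of semantic correspondence."""
    return len(set(q1.split()) & set(q2.split())) >= threshold

# Correspondence matrix: one cell per (TC-question, other-question) pair.
matrix = {(a, b): corresponds(qa, qb)
          for a, qa in tc_questions_a.items()
          for b, qb in questions_b.items()}

for (a, b), match in sorted(matrix.items()):
    print(a, b, "correspondence" if match else "-")
```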

Table 1: Semantic Correspondences

Following the tradition of qualitative studies (e.g., Bortz, Döring, 1995) the results of the
previous steps are analyzed in a descriptive-statistical way. The analysis of results of step 1
(mutual cross-checks of TC-questions) revealed semantic correspondences for 22 out of 73 questions concerning task conformance exclusively; one question has been found three
times in different techniques in the context of task conformance.

Table 2: Number of TC-Questions Also Assigned to Other Categories

The analysis of the results of step 2 (cross-checking TC-questions against questions assigned to other categories of measurement) answers the question: are there questions that concern task conformance (TC-questions) which are also utilized to measure principles other than task conformance? Overall, 38 out of 73 TC-questions correspond
semantically to questions that are assigned to other principles in other techniques. Given 51
TC-questions, i.e. the reduced set of questions according to the 22 semantic correspondences
found in step 1, 25 questions (about 50%) have a semantic correspondence to questions assigned to other categories of measurement. Particular questions have been used up to 5 times to measure principles other than task conformance. Overall, 47 TC-questions have been found in categories other than TC. Table 2 shows the principles that contained TC-questions.
The leading principle is controllability, followed by self-descriptiveness and
individualization.

Conclusions
Although the principles for designing and evaluating user interfaces are based on common
frameworks, their operational definition has led to results that lack reliability, objectivity, and
validity. In order to overcome these deficiencies, a semantic content analysis and a
subsequent meaning analysis of the principles and measurements are required. In this paper, a
first step towards the empirically sound operational definition has been made through
performing a semantic content analysis.
The semantic consistency of one of the major design principles and measurements,
namely task conformance, has been examined based on 6 techniques for user interface
evaluation. As with most results of qualitative analyses, the results of the semantic content
analysis should be used for further empirical work. In our case the follow-up investigation
should comprise a meaning analysis of task conformance enabling an analytical definition of
this principle.

References
Bortz, N. & Döring: Research Methods and Evaluation (in German), 2nd edition. Springer,
Berlin, 1995.
EU-Directive 90/270/EEC: Human-Computer Interface. Occupational Health and Safety for
VDU-Work (5th Directive, Art. 16 Par. 1 of the Directive 89/391/EEC). In: EU-
Bulletin, Vol. 33, L 156, Minimal Standard (Art. 4 &5), Par. 3, p. 18, 21.06.1990.
ISO 9241 Part 10: Ergonomic Dialogue Design Criteria, Version 3, Committee Draft,
December, 1990.
Oppermann, B. Murchner, H. Reiterer, M.Koch: Ergonomic Evaluation. The Guide EVADIS
II (in German), de Gruyter, Berlin, 1992.
Ravden, S. & G.Johnson: Evaluating Usability of Human-Computer Interfaces. Ellis
Horwood, Chichester, 1989.
TCO, Swedish Confederation of Professional Employees: Software Checker—An Aid to the
Critical Examination of the Ergonomics Properties of Software, Handbook and
Checklist, Sweden, 1992.
Technology Consulting Nordrhein-Westfalen: Ergonomics-Checker (in German) Technik und
Gesellschaft, Vol. 14, Oberhausen, 1993.
Technology Consulting Nordrhein-Westfalen: ABETO—Work Sheets. Oberhausen, 1994.
Ulich, E.: Psychology of Work (in German). 3rd edition, vdf, Zürich, 1994.
Willumeit, G. Gediga, K.Hamborg: IsoMetrics: A Technique for Formative Evaluation of
Software in accordance to ISO 9241/10 (in German). In: Ergonomie und Informatik,
March (1996), 5–12.
WARNINGS
WARNINGS: A TASK-ORIENTED DESIGN APPROACH

Jan Noyes* and Alison Starr**

* Department of Experimental Psychology, University of Bristol


8 Woodland Road, Bristol BS8 1TN, UK
** Smiths Industries Aerospace, Cheltenham GL52 4SF, UK

Air traffic continues to increase; a trend which is expected to persist well into
the next century with a concomitant increase in accident and incident rates
(Last, 1995). Given the prevalence of human error in aircraft operations, the
design of the warning system is of paramount importance since it often
provides the crew with the first indication of a potential problem. This paper
will discuss the findings from recent research on civil aircraft warning systems
carried out at Smiths Industries Aerospace in conjunction with the University
of Bristol and British Airways; some of the issues associated with the design
of current warning systems will be considered. It is concluded that the use of
task-oriented as opposed to fault-oriented handling of information may be the
way forward for the next generation of warning systems.

Designing for Error


Humans make errors, although most of the time these errors are inconsequential with no ill or
long term effects. However, in safety-critical systems such as those involved with aircraft
operation, human error may have catastrophic effects. Although it is not possible to prevent
humans from making mistakes, every attempt must be made when designing systems to
minimise the opportunities for human error, and for remedial actions to be carefully
planned for easy assimilation and execution by the crew. The point of contact between the
flight-deck crew and the aircraft informing them of critical changes in the state of various
aircraft systems is usually the warning system. Consequently, special attention needs to be
applied to its design in order to accommodate any errors which the crew might make
(Billings, 1997).
When considering the causes of aviation incidents and accidents, human error is
implicated in a large number. Figures differ according to definitions and method of
calculation, but human error has been given as a causal factor in 80% of fatal aircraft
accidents in general aviation and 70% in airline operations (Jensen, 1995). Recent statistics
indicate there were 1063 accidents world-wide in commercial jet aircraft between 1959 and
1995 of which 64.4% cited flight crew as a primary cause (Boeing, 1996).

However, it should be noted that human error is a portmanteau expression that does not
differentiate between errors made due to lapses in professional skills and errors arising due to
ordinary human failings.
The area of human error has been well-researched, although the development of a precise
definition and in-depth understanding of human error continues to prove to be a difficult and
elusive goal. A common viewpoint exemplified by Rasmussen (1987) is that human errors arise
because of a mismatch between the human and the task or human-machine misfits. Frequent
misfits are likely to be considered design errors, while occasional misfits may arise due to
variability on the part of the system (component failures) or the human (human errors). Both
external and internal factors may be responsible for this mismatch, although it is generally
thought that internal traits, e.g. skill levels, are not as influential as external factors in their
contribution to human error. External performance shaping factors of relevance to the design of
aircraft warning systems might include: (i) inadequate human engineering design, e.g. violation
of population stereotypes resulting in sequence and selection errors; (ii) inadequate work space
and work layout, which may contribute towards fatigue, decreased productivity and increased
errors; (iii) inadequate job aids, e.g. poorly written manuals and procedures may lead to
uncertainty and errors on the part of the operator (see, Noyes and Stanton, 1997).
In the avionics application there are specific difficulties associated with studying human
error. For example, in some situations, accidents are likely to be catastrophic. As a result, evidence
about the cause of the accident is often lost. The main participants may be deceased, thus
hampering the search for the causes of the errors. This type of situation is exacerbated by our
limited understanding of the role of the human operator in accident processes (Kayten, 1989).
In summary, the complexities of human behaviour make studying human error a
challenging task with many difficult theoretical problems (Leplat, 1987). There exists no
single theory or model which predicts the occurrence of human errors, which would provide
an initial step towards learning more about the causes, and hence, the prevention of errors.
Often there is no right or wrong decision, only the best decision for any given set of
characteristics and for any given point in time. Often, the outcome in terms of the results of
making errors and subsequent decisions is not known until later. Consequently when
designing systems, failure to know in detail why a human error occurs makes the
development of a solution strategy both difficult and inefficient.
When considering methodologies for the study of human error, these are very different
from studying behaviour based on simple, rule-directed decisions. Laboratory studies of
human error and post-hoc analyses of incidents which involve collecting individual accounts/
reactions, etc. are often not fruitful in terms of yielding definitive information about the
causes of making errors (Nagel, 1988). Consequently, the approach taken here was to use a
self-report technique developed through extensive observation and interview studies, and in
conjunction with analyses of accident and incident data.

Flight-deck Crew Survey


The findings presented here emanate from a questionnaire survey of 1360 commercial flight-
deck crew (representing a return rate of just over 40%). This questionnaire on aircraft
warning systems was developed through an extensive knowledge elicitation process; for
example, user requirements of current and future warning systems were assessed by taking a
descriptive (do you have this feature/function?) followed by a prescriptive approach (would
you like it?). Respondents were asked to state the extent of their agreement with 51
statements on a 7-point Likert scale from ‘strongly agree’ through to ‘strongly disagree’.
Further details concerning this methodological approach are given in Noyes, Starr and
Frankish (1996).
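As a minimal illustration of how such Likert responses can be summarised (with invented ratings and an assumed coding in which 1 to 3 indicates agreement), the percentage of crew agreeing with a statement might be computed as follows:

```python
# Minimal sketch with invented ratings, assuming a coding in which 1-3 indicates
# agreement on the 7-point scale: the percentage of crew agreeing with one statement.
ratings = [1, 2, 2, 3, 4, 5, 2, 1, 6, 3, 2, 7, 3, 2, 1]   # one rating per respondent

agreeing = sum(1 for r in ratings if r <= 3)
print(f"{100 * agreeing / len(ratings):.0f}% of respondents agreed with the statement")
```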
The purpose of a warning system is manifold, from alerting the crew to an actual or potential
malfunction, through to providing evidence of the problem, and guidelines for remedial actions.
Consequently, the features of a good warning system include the provision of a complete set of
warnings with enough information (to anticipate problems before they arise, and to be aware of
them when they do arise), guidance to deal with the situation, provision of information about
secondary consequences, and the reduction of ‘false’ and ‘nuisance’ warnings.

Brief Summary of User Responses


Findings from the questionnaire survey indicated that the majority of respondents felt that
warning information was complete in terms of appropriateness of warnings to a given
situation1, sufficient to identify the problem(s)2, and providing direction towards corrective
procedures3 (with over 80% of crew agreeing that the warnings in their aircraft are effective
in directing them towards appropriate actions). But in general, they did not think that current
warning systems provided adequate information about secondary consequences of
malfunctions4 5, and this was a feature which they viewed favourably. In addition, the
questionnaire results indicated that crew would like warning systems to have a predictive
capacity allowing the anticipation of problems6 7. False warnings were generally thought to
be undesirable8, but were not viewed as a significant problem; this is presumably because
they are a rare occurrence (although one respondent made the comment that “one false
warning is ‘too often’”). A clear need was expressed by crew to have systems adept at
handling multiple warnings9, and it was felt that current systems were not as supportive as
they might be in these situations10.

Footnotes
1. Descriptive statement—“All warnings appropriate to the situation are given”
2. Prescriptive statement—“The warnings provided in my aircraft are usually sufficient to identify immediately the source of the problem”
3. Descriptive statement—“Warnings that are given are effective in directing me to appropriate procedures for dealing with the problem”
4. Descriptive statement—“Flight-deck displays provide adequate information about secondary consequences of malfunctions (e.g. inoperative systems, restrictions on operational procedures, etc.)”
5. Prescriptive statement—“Flight-deck displays should provide information about secondary consequences of malfunctions”
6. Descriptive statement—“The flight-deck instrumentation available in my aircraft is effective in enabling problems to be anticipated before warnings are triggered (e.g. by indicating parameters that are slightly in error, but still within tolerance)”
7. Prescriptive statement—“Flight-deck instrumentation should enable problems to be anticipated”
8. Descriptive statement—“False warnings appear too often”
9. Descriptive statement—“It is easy to interpret warning displays when several warnings appear at the same time”
10. Prescriptive statement—“When several warnings conditions are active, only the most important should be displayed”

Discussion
The provision of warning information on civil flight-decks has changed significantly over the
years from the early distributed warning lights through to the introduction of multifunction
displays with associated system schematics and checklists (see, Starr, Noyes, Ovenden and
Rankin, 1997, for a full history of the evolution of warning systems on civil aircraft).
Although this research programme has only been conducted with a single airline with 10
different aircraft types (and as such may not fully represent the views of flight-deck crew in
other airlines), it is generally accepted that current commercial aircraft warning systems have
the ability to provide a large amount of data. However, they tend not to:
(i) integrate data from several sources into a format determined by the current situation,
e.g. phase of flight;
(ii) allow anticipation of malfunctions by conveyance of prediction information
concerning abnormal conditions to the crew;
(iii) provide advanced indication of the consequences of crew decision-making and
actions.
It could therefore be concluded that most conventional warning systems are fault-oriented,
and corrective actions are directed towards management of the immediate problem; priorities
being determined according to a pre-determined hierarchy. Current warning systems tend to
present a large amount of ‘unprocessed’ information, which essentially lacks integration and
situational modification across and within sources to aid diagnosis and corrective actions.
Although the basic functions of warning systems are unlikely to change significantly in the
future, it is likely that the amount of information which can be made available to the crew will
continue to increase as it extends to include more system parameters and external events.
Looking to the future, part of the solution to improve upon current systems may be to
develop task-oriented warning systems. Future warning systems could perhaps aim to rectify
this potential crew ‘information overload’ situation by providing information tailored to the
overall aircraft situation, offering a range of options to be evaluated in the light of future
operational requirements. The development of ‘soft displays’ supported by powerful
computational resources could facilitate the development of task-oriented warning systems
which provide information tailored directly to users’ current requirements. This type of
interface should aid the performance of the human operator in terms of the management and
presentation of warning information; this in turn should reduce the opportunities for human
error. It would also be in keeping with the views expressed by the current user population of civil flight-decks, as demonstrated in this research programme.
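By way of illustration only, and not as part of the original research programme or a description of any actual aircraft system, the situation-dependent filtering and prioritisation described above might be sketched as follows; the flight phases, warning names, severity values and display limit are hypothetical assumptions.

# Hypothetical sketch of situation-dependent warning prioritisation.
# Phase names, warnings, severity values and the display limit are
# illustrative assumptions, not a description of any certified system.
from dataclasses import dataclass

@dataclass
class CrewWarning:
    name: str
    severity: int          # 3 = most severe, 1 = least severe
    relevant_phases: set   # flight phases in which the warning matters

def prioritise(warnings, phase, max_shown=3):
    """Keep warnings relevant to the current flight phase and present only
    the most severe ones, mimicking a task-oriented presentation."""
    relevant = [w for w in warnings if phase in w.relevant_phases]
    relevant.sort(key=lambda w: w.severity, reverse=True)
    return relevant[:max_shown]

active = [
    CrewWarning("CABIN DOOR", 2, {"taxi", "takeoff"}),
    CrewWarning("ENGINE FIRE", 3, {"taxi", "takeoff", "cruise", "landing"}),
    CrewWarning("GALLEY FAULT", 1, {"cruise"}),
]
print([w.name for w in prioritise(active, phase="takeoff")])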

Acknowledgements
This work was carried out as part of a UK Department of Trade and Industry funded project,
IED: 4/1/2200 ‘A Model-Based Reasoning Approach to Warning and Diagnostic Systems for
Aircraft Application’.
Thanks to British Airways for their participation in this research programme, and
especially to all flight-deck crew who completed interviews and questionnaires.
Thanks are also due to the late David Eyre for the meticulous statistical analyses carried
out on the questionnaire data.

References
Billings, C.E. 1997, Aviation Automation: The Search for a Human-centred Approach,
(LEA, New Jersey)
Boeing 1996, Table of all accidents—World-wide commercial jet fleet, Flight Deck, 21, 57.
Jensen, R.S. 1995, Pilot Judgement and Crew Resource Management, (Avebury Aviation,
Aldershot)
Kayten, P. 1989, Human performance factors in aircraft accident investigation. In
Proceedings of the 2nd Conference on Human Error Avoidance Techniques, Herndon,
VA, (SAE International, Warrendale, PA), Paper 892608, 49–56
Last, S. 1995, Hidden origins to crew-caused accidents. In Proceedings of IFALPA
Conference, Interpilot, June Issue, 5–15
Leplat, J. 1987, Some observations on error analysis. In J.Rasmussen, K.Duncan and J.
Leplat (eds.), New technology and human error, (Wiley, Chichester), 311–316
Nagel, D.C. 1988, Human error in aviation operations. In E.L.Wiener and D.C.Nagel (eds.),
Human factors in aviation, (Academic Press, San Diego), 263–303
Noyes, J.M., Starr, A.F. and Frankish, C.R. 1996, User involvement in the early stages of
the development of an aircraft warning system. Behaviour & Information Technology,
15(2), 67–75.
Noyes, J.M. and Stanton, N.A. 1997, Engineering psychology: Contribution to system
safety. Computing & Control Engineering Journal, 8(3), 107–112.
Rasmussen, J. 1987, The definition of human error and a taxonomy for technical system
design. In J.Rasmussen, K.Duncan and J.Leplat (eds.), New technology and human
error, (Wiley, Chichester), 23–30
Starr, A.F., Noyes, J.M., Ovenden, C.R. and Rankin, J.A. 1997, Civil aircraft warning
systems: A successful evolution? In Proceedings of IASC ‘97 (International Aviation
Safety Conference), edited by H.M.Soekkha, (VSP BV, Rotterdam, Netherlands),
507–524
EFFECTS OF AUDITORILY-PRESENTED WARNING
SIGNAL WORDS ON INTENDED CAREFULNESS

Rana S.Barzegar and Michael S.Wogalter

Ergonomics Program, Department of Psychology
North Carolina State University
Raleigh, North Carolina 27695–7801 USA

This study investigates whether signal words such as DANGER, WARNING, and CAUTION, presented under different vocal conditions, influence intended compliance. Male and female participants listened to cassette tapes of signal words presented by a male or female speaker in monotone, emotional, and whisper voice styles at either a low or high sound level. The results showed that female speakers produced significantly higher ratings of intended carefulness. Of the five signal words examined, DEADLY received the highest ratings, followed by DANGER; and NOTICE received the lowest carefulness ratings. WARNING and CAUTION did not differ. The safety implications of these results are discussed.

Introduction
Current warning design standards and guidelines recommend the use of signal words to alert
individuals to the presence and level of potential hazards. Standards and guidelines in the US
generally recommend DANGER, WARNING, and CAUTION to indicate high to low levels
of hazard, respectively (e.g., ANSI, 1991; FMC Corporation, 1985). According to ANSI
(1991) these terms have been assigned the following definitions. DANGER should be used to
indicate immediate hazards that will result in severe personal injury or death. WARNING is
recommended for use with hazards or unsafe practices that could result in severe personal
injury or death. Finally, CAUTION is recommended for hazards or unsafe practices that
could result in minor personal injury and/or product or property damage. Research has
consistently shown that people do, in fact, perceive DANGER to connote a significantly
greater hazard than both WARNING and CAUTION, but people do not differentiate between
the two latter terms (e.g., Wogalter and Silver, 1990; 1995). Other research has investigated
whether alternate terms, such as DEADLY and LETHAL, are useful in conveying different
hazard levels (Wogalter and Silver, 1990; 1995).
All previous research on signal words has evaluated their effectiveness as presented
visually in the print medium. Although there is research on nonverbal auditory warning
signals (e.g., see Edworthy and Adams, 1996 for a review), there has been no research on the
effects of auditory/voiced/verbal signal words. The present research is an initial attempt to
examine the effects of voiced signal words on connoted hazard (intended carefulness ratings).
Previous studies suggest that voiced warnings have potential for effective warning

communication. Wogalter and Young (1991) and Wogalter et al. (1994) showed that voiced
warnings produced greater compliance than the same message in print. One benefit is that the
receivers of the information do not need to be looking in a particular direction, as would be
needed with visually presented information (Wogalter and Young, 1991; Wogalter et al.,
1994). Another benefit of voiced warnings is their potential utility for informing those who
have difficulty reading the English language, including children and individuals with vision
problems. With recent advancements in digital speech technology, voiced warnings could be
used to communicate hazards of various types under various conditions.
The present study examines the effects of signal words presented in monotone, emotional,
and whisper voices on intended compliance. Sound levels (dBA) were manipulated (low vs.
high) with the amplitude levels equated among the three voicing methods. Mershon and
Philbeck (1991) found that a whisper presented at the level of normal speech is significantly
more salient and arousing than normal speech. In addition, gender was examined with respect
to both the speaker (i.e., presenter or source) and the participant (i.e., listener or receiver).
Although 43 words were used as stimuli in this research, the present article describes the
results of the five terms that have been investigated most extensively in previous research
(DEADLY, DANGER, WARNING, CAUTION, and NOTICE). Three of these terms,
DANGER, WARNING, and CAUTION, are recommended by ANSI (1991) to indicate high
to low levels of hazard, respectively. Previous research by Wogalter and Silver (1995) and
Wogalter et al. (1997) has shown that DEADLY connotes a substantially greater hazard than DANGER. NOTICE is a nonhazard-related term recognized by ANSI (1991) to call attention to important information (Westinghouse Product Safety Label Handbook, 1981).

Method

Participants
Seventy-two undergraduate students taking an introductory psychology course at North
Carolina State University participated. They were compensated with credit towards the
course. An equal number of males and females participated.

Stimulus materials
The signal words were taken from a list of 43 words investigated by Wogalter and Silver
(1995). They are shown below in alphabetical order:

ALARM DON’T LETHAL REQUIRED
ALERT EXPLOSIVE NECESSARY RISKY
ATTENTION FATAL NEEDED SERIOUS
BEWARE FORBIDDEN NEVER SEVERE
CAREFUL HALT NO STOP
CAUTION HARMFUL NOTE TOXIC
CRITICAL HAZARD NOTICE UNSAFE
CRUCIAL HAZARDOUS POISON URGENT
DANGER HOT PREVENT VITAL
DANGEROUS IMPORTANT PROHIBIT WARNING
DEADLY INJURIOUS REMINDER

The above words were arranged in 18 random orders, each recorded on a separate audio
cassette tape. The recordings were produced in a sound chamber using a Marantz PMD201
professional portable cassette recorder, Audio-Technica ATR30 vocal/instrument
microphone, microphone stand, TDK DS-X90 audio tapes and Koss TD/60 enclosed ear
headphones.

Each speaker produced three recordings, one in each voicing method (monotone, emotional,
and whisper) with a different random order word list for each. Each recording consisted of
signal words presented at a rate of 8 s intervals (onset to onset) with a quiet period between each
word. Three male and three female speakers were used to make the recordings.

Procedure
Participants were informed that they would hear a series of words presented on three cassette
tapes. The instructions were to listen to each word and rate “How careful would you be after
hearing each word?” based both on its meaning and on how it is presented. Ratings were
made on a 9-point Likert-type scale with the following verbal anchors placed at the even-
numbered points: 0—not at all careful, 2—slightly careful, 4—careful, 6—very careful, and
8—extremely careful.
Each participant heard three tapes, monotone, emotional, and whisper, in different random
orders. Sound level (low: 60 dBA vs. high: 90 dBA) and speaker gender (male vs. female) were manipulated between participants. All tapes heard by a given participant were presented either at the low or high sound level and by a male or female speaker. Participants were randomly assigned to conditions according to a schedule that ensured equal numbers of male and female participants in each combination of sound level and word order.
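Purely as an illustration of this kind of between-subjects counterbalancing, a minimal Python sketch is given below; the 36/36 gender split follows from the 72 participants reported above, but the exact assignment schedule and the handling of word order are simplified assumptions rather than the authors' actual procedure.

# Illustrative counterbalancing of the between-subjects factors described
# above (sound level x speaker gender, within each participant gender).
# The schedule below is an assumption, not the authors' actual schedule.
import itertools
import random

sound_levels = ["low (60 dBA)", "high (90 dBA)"]
speaker_genders = ["male speaker", "female speaker"]
cells = list(itertools.product(sound_levels, speaker_genders))  # 4 cells

def make_schedule(n_participants, cells):
    """Assign participants evenly to cells, then randomise the order."""
    schedule = cells * (n_participants // len(cells))
    random.shuffle(schedule)
    return schedule

male_schedule = make_schedule(36, cells)    # 36 male participants
female_schedule = make_schedule(36, cells)  # 36 female participants
print(male_schedule[:3])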

Results
The data were examined using a 2 (Sound level: low vs. high) X 2 (Speaker gender: male vs.
female) X 2 (Participant gender: male vs. female) X 3 (Voicing method: monotone vs.
emotional vs. whisper) X 5 (Signal Words: DEADLY vs. DANGER vs. WARNING vs.
NOTICE vs. CAUTION) mixed-model design analysis of variance (ANOVA). The last two
variables, voicing method and signal words, were repeated measures factors; all others were
between-subjects factors.
The ANOVA showed a significant main effect of speaker gender, F(1, 60)= 13.95, p<.001.
Female speakers (M=5.10) produced higher carefulness ratings than male speakers (M=4.18).
Although participant gender failed to reach the conventional p level generally considered
necessary for significance, F(1, 60)=3.82, p=.055, the means showed the tendency for male
participants (M=4.9) to give higher ratings for intended carefulness than female participants
(M=4.4).
The ANOVA showed a significant main effect of voicing method, F(2, 120)=6.86, p<.01.
Comparisons among the means, using Tukey’s Honestly Significant Difference (HSD) test,
showed that the emotional voicing method (M=4.93) produced significantly higher
carefulness ratings (p<.05) than the monotone (M= 4.30). The whisper voice style (M=4.68)
was intermediate and was not significantly different from the other two conditions.
In addition, a significant main effect was found for signal words, F(4, 240)=137.80,
p<.001. Tukey’s HSD test showed that all paired comparisons were significant (DEADLY,
M=6.35; DANGER, M=5.28; WARNING, M=4.44; CAUTION, M=4.25; and NOTICE,
M=2.87), except between WARNING and CAUTION.
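For readers unfamiliar with this analysis, a minimal sketch of such pairwise Tukey HSD comparisons follows; the toy data frame, its column names and the simplification of treating the repeated-measures ratings as independent groups are assumptions, since the raw data are not reproduced here.

# Sketch of pairwise comparisons among the five signal words using Tukey's
# HSD. The ratings below are toy values, and the repeated-measures structure
# of the actual design is ignored for simplicity.
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

ratings = pd.DataFrame({
    "word":   ["DEADLY", "DANGER", "WARNING", "CAUTION", "NOTICE"] * 12,
    "rating": [6, 5, 4, 4, 3, 7, 6, 5, 4, 3, 6, 5, 4, 5, 2] * 4,
})

result = pairwise_tukeyhsd(endog=ratings["rating"],
                           groups=ratings["word"], alpha=0.05)
print(result.summary())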

Table 1. Means as a function of voicing method and signal word



The ANOVA also indicated the presence of three significant interaction effects. Table 1
presents the means for the interaction between voicing method and signal words, F(8,
480)=2.56, p<.01. The emotional voicing method produced significantly higher ratings than
the monotone for both WARNING and NOTICE. In addition, NOTICE voiced emotionally
was rated higher than NOTICE whispered. DEADLY whispered was rated higher than
DEADLY voiced in monotone. There were no significant voicing-method differences for
CAUTION and DANGER.

Table 2. Means as a function of speaker gender and signal word

Speaker gender and signal word interacted, F(4, 240)=6.82, p<.001. The means in Table 2
show that female speakers consistently produced higher carefulness ratings than male
speakers for all signal words, except NOTICE. These 2 factors interacted with sound level in
a three-factor interaction of sound level, speaker gender, and signal words, F(4, 240)=3.37,
p<.05. The means for this interaction, displayed in Table 3, depict a similar pattern to the
speaker gender by signal word interaction described above, with two relatively minor
magnitude changes as a function of sound level. The speaker gender difference is larger for
DEADLY in the low sound level condition and for WARNING in the high sound level
condition. Note that the greatest intended carefulness was produced with DEADLY spoken in
a low level female voice.

Table 3. Means as a function of sound level, speaker gender, and signal word

Discussion
Various parameters of auditorily-presented signal words can affect receivers’ intended
carefulness. For the most part, emotionally toned voices produced the highest carefulness
ratings, particularly compared to the monotone voices. Perhaps the higher ratings for the emotional tone reflect the way people would naturally vocalize a hazard. In emergency-type communications, people become excited and emotional, speaking at a higher pitch and at a faster rate. Therefore, the emotional tone may cue listeners to the urgency of the situation. Research has shown that nonverbal auditory signals presented at a faster rate and at higher frequencies increase perceived urgency (Edworthy and Adams, 1996). Related to this are the higher carefulness ratings obtained when the signal words were presented by female speakers.

This concurs with previous findings showing that higher physical frequencies (i.e., the female
voice) produce greater perceived urgency (Edworthy and Adams, 1996).
The perceived hazard levels associated with the signal words were ordered high to low as
follows: DEADLY, DANGER, WARNING, CAUTION, and NOTICE. This order is
consistent with previous research of visually presented signal words (Wogalter and Silver,
1995). Several other results were also consistent with previous research of visually presented
signal words. First, there was no significant difference between WARNING and CAUTION
on perceived hazard (i.e., intended carefulness) (Wogalter and Silver, 1990). Second,
DEADLY was consistently rated higher than DANGER (Wogalter & Silver, 1995; Wogalter
et al., 1997). Third, the low ratings for NOTICE for both male and female participants
reflect the fact that this term has no specific hazard-related implications. Several complex
interactions were noted in the analysis. We will withhold elaborate explanations until there is
additional evidence and replication.
Clearly these results have implications for safety. Modern technology has provided voice
recordable transistor chips (found in greeting cards, answering machines), which when
combined with one or more detection systems (e.g., motion, infrared, sound) can potentially
communicate effective, timely warnings. Only a few of the many sound parameters were
investigated in the present study. Other parameters of voice warnings still need to be
examined.

References
ANSI. 1991, American national standard on product safety signs: Z535.1–5, (American
National Standards Institute, New York)
Edworthy, J., and Adams, A. 1996, Warning Design: A Research Perspective, 129–178
FMC Corporation. 1985, Product safety sign and label system, (Santa Clara, CA: Author)
Mershon, D.H., and Philbeck, J.W. 1991, Auditory perceived distance of familiar speech
sounds, Paper presented at the Annual Meeting of the Psychonomic Society, (San
Francisco, CA)
Westinghouse Printing Division. 1981, Westinghouse product safety label handbook,
(Trafford, PA: Author)
Wogalter, M.S., Frederick, L.I., Herrera, O.L., and Magurno, A.B. 1997, Connoted hazard
of Spanish and English warning signal words, colors, and symbols by native Spanish
language users. Proceedings of the 13th Triennial Congress of the International
Ergonomics Association, IEA ‘97, 3, 353–355
Wogalter, M.S., Racicot, B.M., Kalsher, M.J., and Simpson, S.N. 1994, The role of
perceived relevance in behavioral compliance in personalized warning signs.
International Journal of Industrial Ergonomics, 14, 233–242
Wogalter, M.S., and Silver, N.C. 1990, Arousal strength of signal words. Forensic Reports,
3, 407–420
Wogalter, M.S. and Silver, N.C. 1995, Warning signal words: connoted strength and
understandability by children, elders, and non-native English speakers,
Ergonomics, 38, 2188–2206
Wogalter, M.S., and Young, S.L. 1991, Behavioural compliance to voice and print warnings.
Ergonomics, 34, 79–89
LISTENERS’ UNDERSTANDING OF WARNING
SIGNAL WORDS

Judy Edworthy, Wendy Clift-Matthews & Mark Crowther

Department of Psychology
University of Plymouth
Drake Circus
Plymouth PL4 8AA

This paper presents two studies which look at the interaction between the
arousal strength of signal words and the way in which they are spoken. In the
first study, listeners rated the urgency, appropriateness and believability of
eight signal words which were presented in either an appropriate, or an
inappropriate, voice tone by human speakers. Listeners’ judgements of all
three measures was strongly affected by the way in which the words were
spoken. In a second study the words were presented in a synthesized format
and were subjected to some basic urgency modelling. Results from this study
were more ambiguous, although differences in urgency were noted. The
research implications of the findings are discussed.

Introduction
Much evidence exists to show that the perceived urgency of nonverbal auditory warnings can
be influenced by their acoustic structure. For example, it has been demonstrated that
warnings which are higher in frequency, louder, faster, and vary along a number of other
dimensions such as pitch contour, amplitude envelope, rhythm and so on are rated as being
more urgent than warnings with lower values of these parameters (Edworthy et al, 1991).
There is evidence also to show that people’s responses to warnings designed in such a way as
to sound acoustically urgent or nonurgent vary in important, practical ways (e.g. Bliss et al,
1995). Increasingly, speech warnings are used where other, more traditional types of
warnings might have been used. Thus an interesting research question arises as to the extent
to which the urgency, as well as the believability and appropriateness, of speech warnings can
be influenced by those same acoustic parameters that influence nonverbal auditory warnings.
In particular, there is the question as to the interaction between the semantic content of a
speech message and the way, acoustically, it is presented. This paper presents a pair of studies
which begin to look at this interaction. They form the basis of a more comprehensive set of
ongoing studies which look at the design and response to speech warnings in multitask
environments.

Study 1: Natural speech and signal words


It is well established that some words typically used on warning labels, usually known as
signal words, vary systematically in their arousal strength (e.g. Wogalter & Silver, 1995). For
example words like Deadly and Danger always score higher ratings than words like Don’t
and Note. The most stable of these words can be used to create a scale of semantic urgency which can then be manipulated acoustically in order to address the interaction between the semantic content of the word and the way in which it is presented acoustically. For example, the extent to which the urgency of a word such as Danger is influenced by the way in which it is spoken can give us some insight into the way the acoustic structure of the word, and its
semantic meaning or strength, interact and contribute to the overall impression of the word.
As acoustic analysis of speech sounds is complex, we decided in the first instance (and as a
precursor to experiments which will involve acoustic analysis) simply to ask two speakers,
one male and one female, to speak a set of eight signal words in both an appropriate and an
inappropriate manner, leaving the speakers to decide how to say each of the words.

Method
Two human speakers, one male and one female, were asked to speak the eight signal words
Lethal, Deadly, Poison, Danger, Beware, Warning, Attention and Don’t in both an appropriate
and an inappropriate manner. The speakers were left to decide how the words should be
spoken. Forty-three participants were asked to rate three features of these words: first, the
urgency on a 0–100 scale; second, the appropriateness of each of the words on a 1–8 scale;
and third the believability of each of the words on a 1–8 scale. Each of the scales was selected
because of their similarity with earlier studies which had asked participants to rate these
dimensions. Each stimulus was heard twice by each participant, in a randomised order. The
stimuli were presented on cassette tape.

Results and Discussion


Three sets of measures were taken for each word: its urgency, its believability and its
appropriateness. The results for the urgency measure revealed a sex difference (with the
female voice producing higher scores overall than the male voice). However, this may be due
to individual differences between the speakers, so is not emphasized here. The central finding
for the urgency measure is that main effects were found for both style of speaking
(appropriate or inappropriate) and signal word, as well as an interaction between style and
signal word for the female speaker (F=102.98, df=1, p<.001 for style, F=9.06, df=7, p<.001
for signal words and F=10.39, df=7, p<.001 for the interaction between style and signal
word). For the male speaker, a main effect for style was found (F=89.78, df=1, p<.001) and
an interaction between style and word was found (F=3.03, df=7, p<.005).
These results show that the way in which each of the speakers spoke the words had a very
large effect on the urgency of the words. Words spoken in an appropriate manner were judged
to be considerably more urgent than those presented in an inappropriate manner. However,
the word itself also had a fairly prominent effect on listeners’ judgements, producing an
interaction in each case and a main effect in the case of the female speaker. The results for the
female speaker in particular show that the words already known to possess higher levels of
arousal produced higher ratings of urgency. Generally, the pattern shown was that the words
remained more or less in their expected order, from Deadly at the top to Don’t at the bottom,

but with the scores for the appropriate words being higher than those for the inappropriate
words.
The results for the appropriateness and the believability measures demonstrated a very
similar pattern. For the appropriateness measures, separate 2-way ANOVAs on the female
and the male stimuli (inappropriate/appropriate×word (8 levels)) revealed main effects for appropriateness and an interaction between appropriateness and word (F=1033, df=1,
p<.001 for female appropriateness, F=422, df=1, p<.001 for male appropriateness, F=5.325,
df=7, p<.001 for the interaction between appropriateness and word for the female speaker
and F=2.6168, df=7, p<.05 for the interaction between appropriateness and word for the male
speaker). The effect for word itself was significant in the case of the female speaker, but not
in the case of the male speaker. Thus the majority of the variance was accounted for by the
contrast between scores for the appropriately and the inappropriately spoken words.
The pattern for the believability measure was very much the same as the pattern for the appropriateness measures, for both speakers. As with the appropriateness measure, the factor accounting for most of the variance was the contrast between the appropriately spoken words and the inappropriately spoken words. Thus the pattern of response was similar
across all three measures, showing effects for signal words in line with previous research for
written words (e.g. Wogalter & Silver, 1995) and very large effects for the way in which a
word is spoken.

Study 2: Synthesized speech and signal words


Previous studies on nonverbal warnings have shown that factors such as intensity, frequency
and speed have considerable effects on the perceived urgency of auditory warnings (e.g.
Edworthy et al, 1991). It is likely that these were amongst the key factors that our two live
speakers varied in producing their appropriate and inappropriate versions of the signal words.
These three factors would seem to be the primary (and technically the most readily available)
features which can be manipulated on almost any digitized speech generation system, so we
decided simply to take a very basic synthesizer to explore whether manipulation of these
features could produce discernible changes in listeners’ judgements of urgency,
appropriateness and believability in a similar manner to live speakers.

Method
Two ‘speakers’, one male and one female, were chosen from a set of available voice types on
a ‘Text’LE’ program found within a ‘Soundblaster 64’ sound system run on a PC using
Windows 95. Each speaker was given the same eight words as before—Lethal, Deadly,
Poison, Danger, Beware, Warning, Attention and Don’t—and the appropriateness of each of
the words was manipulated by setting the pitch and speed levels of one version of the word
considerably higher than the other version of the word. The former were then labelled
‘appropriate’ and the latter ‘inappropriate’. Each of the words was then recorded on digital
tape for use in the study.
Each stimulus was then presented twice to 43 participants, who were asked to rate the
urgency, appropriateness and believability of each of the words as in Study 1.
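The manipulations themselves were made with the synthesizer's own pitch and speed settings; purely as a present-day illustration, comparable manipulations of a recorded word could be sketched as follows, where the file names, shift amounts and the use of the librosa and soundfile libraries are assumptions rather than part of the original study.

# Illustrative pitch and speed manipulation of a recorded signal word.
# File names and shift amounts are assumptions; the original study used a
# text-to-speech synthesizer's own pitch and speed settings instead.
import librosa
import soundfile as sf

y, sr = librosa.load("danger_neutral.wav", sr=None)  # hypothetical recording

# 'Appropriate' version: considerably higher pitch and faster delivery.
appropriate = librosa.effects.pitch_shift(y, sr=sr, n_steps=4)
appropriate = librosa.effects.time_stretch(appropriate, rate=1.25)

# 'Inappropriate' version: lower pitch and slower delivery.
inappropriate = librosa.effects.pitch_shift(y, sr=sr, n_steps=-4)
inappropriate = librosa.effects.time_stretch(inappropriate, rate=0.8)

sf.write("danger_appropriate.wav", appropriate, sr)
sf.write("danger_inappropriate.wav", inappropriate, sr)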

Results and Discussion


Again, three sets of measures were taken for each word: its urgency, its believability and its
appropriateness. In the case of this study, the male and female data were combined and
analyzed in a series of 3-way sex×word×style ANOVAs. For the urgency measures, no effect
was found between the two speakers (F=0.512, df=1, p=.48). A significant effect was found for word (F=10.42, df=7, p<.001), as was also a significant effect for style (appropriate vs inappropriate) (F=12.40, df=1, p<.005). Interactions were found between speaker and word, and word and style. The major part of the variance here was accounted for by the effects of word and style, so the results are largely similar to those found for the live speakers.
Some departure from the live speaker results was demonstrated by the appropriateness and the believability measures, however. Although some significant results were obtained,
similar to those found earlier, the most striking thing about both the appropriateness and
believability measures was that no significant difference was found between the ‘appropriate’
and ‘inappropriate’ versions of each of the words on either the appropriateness or
believability measures. In other words, the basic acoustic manipulations that we applied did
not influence listeners’ judgements of the believability and appropriateness of the words,
even though some effect for urgency was found.
The results for these two measures show strong effects for word (F=4.8, df=7, p <.001 for
appropriateness scores and F=7.46, df=7, p<.001 for believability scores), again in line with
earlier findings for written signal words. Some interactions were obtained, as before.
No effects were found for speaker across all three sets of data.

General Discussion
This pair of experiments brings forward a number of interesting research points which will
need to be elaborated more fully in future studies. However, the key points to emerge from
the results can be summarised as follows.
First of all, the results of both experiments show clearly that signal words already known
to vary in their arousal strength when presented in visual form (e.g. Wogalter & Silver 1995)
produce the same general pattern of results when presented in spoken form: words known to
be high on their arousal strength such as Deadly and Danger are rated consistently higher
than words such as Attention and Don’t. This was true across each of the experiments, on
every rating. There are some minor inconsistencies but as a rule the results mirror those for
written words in a striking way.
The second main feature of the results is that the way in which a word is rated is tempered
by the way in which it is spoken. Clearest of all is that words spoken in an appropriate
manner (in a style freely chosen by the speaker in this case) produce much higher ratings of
urgency, appropriateness and believability than those spoken in an inappropriate way. (The
results are less clear for the synthesized speech, which we will come to later). In Study 1 any
sort of acoustic analysis was purposely avoided, because we were interested primarily in
whether speakers can convey appropriateness and inappropriateness to listeners: the results of
Study 1 show very clearly that they can, and that listeners are sensitive to these contrasts. In
fact, Study 1 shows that the most important factor in the study was the style of speaking,
which is an area where fruitful research might be carried out in the future. A secondary aspect
of these results is that they also show that the urgency, appropriateness and believability of

signal words can be reduced or increased by the way in which they are spoken. For example,
the word Deadly can be made less urgent by speaking it in an inappropriate manner. In some
ways this is similar to the effect that can be obtained by varying the colours used to
emphasise signal words. Braun et al (1994) show for example how colours and words can
trade off against one another depending upon how they are combined. The same may be true
for spoken signal words and the way in which they are presented acoustically.
Turning to the results for the digitized words, and their comparison with live speakers, the
results show that at least to some extent urgency can be altered by manipulating pitch and
speed variables: thus as a first pass, such manipulations might be adequate for design
purposes and if intensity was also included such designs and manipulations might be quite
effective. However, the results for believability and appropriateness suggest that such
manipulations do not make the ‘urgent’ words any more believable and appropriate than their
‘nonurgent’ counterparts. This distinction was very clearly delineated for the live speakers,
and no doubt this is because a live speaker is doing very much more with the words when
speaking them in the two styles than simply raising the pitch, speeding up the words and
making them louder. The numerous interactions which were obtained between word and
speaker, and speaker and style, also draw attention to the subtlety of the interaction between
speaker and listener which is taking place. The exploration of this interaction, as well as the
detailed acoustic analysis which will need to be performed in order to understand it more
fully, forms the next phase of this research programme.

References
Bliss, J.P., Gilson, R.D. and Deaton, J.E. 1995, Human probability
matching behaviour in response to alarms of varying reliability
Ergonomics, 38, 2300–2312
Braun, C.C., Sansing, L., Kennedy, R.S. and Silver, N.C. 1994, Signal
word and colour specifications for product warnings: an
isoperformance application. Proceedings of the 38th Annual
Conference of the Human Factors and Ergonomics Society
(Human Factors and Ergonomics Society, Santa Monica), 1104–8
Edworthy, J., Loxley, S.L. and Dennis, I.D. 1991, Improving auditory
warning design: relationship between warning sound parameters
and perceived urgency. Human Factors, 33, 205–31.
Wogalter, M.S. and Silver, N.C. 1995, Warning signal words: connoted
strength and understandability by children, elders, and non-native
English speakers. Ergonomics, 38, 2188–2206.
PERCEIVED HAZARD AND UNDERSTANDABILITY
OF SIGNAL WORDS AND WARNING PICTORIALS
BY CHINESE COMMUNITY IN BRITAIN

Angela K.P.Leung & Elizabeth Hellier

Department of Psychology,
City University,
London, EC1V 0HB

This study investigated the hazard perceptions and understandability ratings of 43 signal words and 12 warning pictorials between the Chinese population and
English population in London. The results showed that for all 43 signal words,
the understandability ratings in the Chinese subjects were significantly lower
than those of the English subjects, although no significant difference was found
for the eight commonly used signal words such as DANGER and WARNING.
Also, the Chinese subjects were found to have similar perceived hazard levels
of signal words as the English subjects. A shorter list of 12 signal words was
selected based on understandability. The results of pictorials on both
comprehension rates and understandability ratings of the Chinese subjects were
found to be significantly lower than the English subjects. The implications of
these findings for hazard communication are discussed.

Introduction
Most standards and guidelines on warning design recommend the use of signal words (e.g.
DANGER) on warnings for the purpose of calling attention to the safety sign and conveying
the degree of potential seriousness of the hazard (FMC Corporation, 1985). The standards
usually recommend the signal words DANGER, WARNING, and CAUTION to denote the
highest to lowest levels of hazard, respectively. However, research in this area has been
equivocal. While some studies (Dunlap et al, 1986) have found significant differences in connoted hazard between the words DANGER and CAUTION, other studies (e.g. Leonard et al, 1986; Wogalter et al, 1987) reported no reliable differences between risk ratings of the words DANGER, WARNING and CAUTION.
Studies in the USA have used participants other than students as subjects (Wogalter and Silver, 1995); however, little research has been carried out in Britain to assess whether the non-native English speaking population perceives the same level of hazard from the commonly used
signal words such as DANGER and WARNING. Since the Chinese community is the third

largest minority ethnic community in Britain, they were used as subjects to see if they
perceived the hazard levels of the signal words in the same way as the English population.
Warning designers have increasingly made greater use of pictorials in hazard
communications. Some research found that warning pictorials and icons might be useful in
assisting hazard communication when the verbal information cannot be read or understood
(Leonard and Karnes, 1993); nevertheless, other studies (e.g. those cited by Casey, 1993) found that
pictorials are not always easy to understand.
There are four hypotheses in this study:

a) for signal words, the Chinese subjects will have lower understandability ratings
than the English subjects
b) for signal words, the Chinese subjects will have different hazardousness ratings
from the English subjects
c) for warning pictorials, no significant difference is expected in the understandability ratings between the Chinese subjects and the English subjects
d) for warning pictorials, there will be no significant difference in the comprehension rates between the Chinese subjects and the English subjects

Another purpose of this study is to develop a list of potential signal words that probably
would be understandable to the Chinese population as well as the English population.

Method

Subjects
Ninety-six subjects participated in this study: 48 Chinese subjects and 48 English subjects. The
Chinese subjects were able to read and speak English, but their first language was not English.

Stimuli and Materials


All subjects were asked to complete a questionnaire which consisted of questions on signal
words and warning pictorials. The presentation and question orders of the questionnaire were
randomised. The 43 signal words which were used in the study of Wogalter and Silver (1995)
were included in the questionnaire. Subjects were given two questions to rate on. The first
question was ‘How much hazard, do you think, is implied by this word?’, rated on a 9-point scale with anchors: (0) no hazard, (2) slight hazard, (4) some hazard, (6) serious hazard, (8) extreme hazard. The second question was ‘How understandable, do you think, is this word?’, rated on a 9-point scale with anchors: (0) not at all understandable, (2) somewhat
understandable, (4) understandable, (6) very understandable, (8) extremely understandable.
For warning pictorials, twelve signs which conform to the British Standard and to the new
Safety Signs at Work Regulations 1994 were used. The 12 signs included: 1) seat belt must be
worn, 2) helmet must be worn, 3) breathing mask must be worn, 4) guard must be used, 5)
eye wash, 6) fire exit, 7) fire extinguisher, 8) wet floor, 9) corrosive substance, 10) risk of
explosion, 11) no entry and 12) do not operate. All pictorials were shown in black and white
and measured approximately 4.5cm×5cm.
There were two tasks for the subjects. The first task was to write down the meaning of the
pictorial as specifically as possible. The second task was to answer the following

question for each pictorial: ‘How understandable, do you think, is this pictorial?’, rated on a 9-point
rating scale with anchors: (0) not at all understandable, (2) somewhat understandable, (4)
understandable, (6) very understandable, (8) extremely understandable.

Procedure
Subjects were tested individually. All subjects were told that they were to complete a
questionnaire which was about some signal words and warning pictorials, both of which are
sometimes used to indicate hazard.

Results

Signal words
A 2 (Chinese and English subjects)×8 (signal words: NOTE, ATTENTION, NOTICE,
CAREFUL, DANGER, CAUTION, WARNING, DEADLY) ANOVA was performed using
understandability rating as the dependent variable. The ANOVA did not show a significant
main effect of nationality, F(1,94)=2.67, p>0.05. This result suggested that there was no
significant difference in the understandability ratings of the eight commonly used signal
words between the Chinese and English subjects. However, an independent t-test found that,
for the 43 signal words, the mean of the means of understandability ratings of the Chinese
subjects (M=4.98) was significantly lower than that of the English subjects (M=5.64), t(84)=-
2.67, p<0.05. This indicates that apart from these eight commonly used signal words, there
were some signal words, e.g. HALT, PROHIBIT, FORBIDDEN, etc. which appeared to have
lower understandability ratings in the Chinese subjects than in the English subjects.
A 2×8 (signal words: NOTE, ATTENTION, NOTICE, CAREFUL, DANGER,
CAUTION, WARNING, DEADLY) ANOVA was performed using hazardousness ratings as
the dependent variable. The ANOVA did not show a significant main effect of nationality,
F(1,94)=2.65, p>0.05. This result suggested that there was no significant difference in the
hazardousness ratings of the eight commonly used signal words between the Chinese and
English subjects, and thus did not support the second hypothesis.
However, there was a significant main effect of signal word, F(7,658)=308.64, p<0.01, with
DEADLY, DANGER, WARNING, CAUTION, CAREFUL, ATTENTION, NOTICE, NOTE
rated from the greatest to least on overall hazardousness, i.e. when hazardousness ratings were
collapsed across nationality. The results of subsequent Newman-Keuls tests found that DANGER
was rated significantly higher on hazardousness than either WARNING or CAUTION.
Nonetheless, WARNING was not rated significantly higher on hazardousness than CAUTION.

Warning pictorials
A 2 (Chinese and English subjects)×12 pictorials ANOVA was performed using
understandability ratings as dependent variable. The result showed a significant main effect
of nationality, F(1,94)=8.91, p<0.05 and suggested that the mean of understandability ratings
of the Chinese subjects (M=3.38) was significantly lower than that of the English subjects
(M=4.16). This result did not support the third hypothesis.
A 2×12 pictorials ANOVA was performed using comprehension rates as dependent
variable. The ANOVA showed a significant main effect of nationality, F(1,94)=10.15, p<0.01
and suggested that the mean of comprehension rates of the Chinese subjects (M=0.58) was

significantly lower than that of the English subjects (M=0.70). This result did not support the
fourth hypothesis.

Discussion
The result of ANOVA for the eight commonly used signal words (NOTE, ATTENTION,
NOTICE, CAREFUL, DEADLY, including the three most commonly used: CAUTION,
WARNING, DANGER) suggested that there was no significant difference in the
understandability ratings between the Chinese and the English subjects. One possible reason
for this finding could be, as Wogalter and Silver (1995) suggested, that in their limited exposure to the English language the Chinese subjects received training on the intended meanings of these commonly used words (perhaps through formal instruction, or through paying close attention to the gradations of English word meanings or to verbiage on products manufactured by English-speaking countries).
However, a further test indicated that, across all 43 signal words, the understandability ratings of the Chinese subjects were significantly lower than those of the English subjects. The results suggested that apart from these eight commonly used signal
words, there were some signal words, e.g. HALT, LETHAL, PROHIBIT, FORBIDDEN, etc.
which appeared to have lower understandability ratings in the Chinese subjects than in the
English subjects. Therefore, it is recommended not to use these less common words when the
target population is the Chinese Community in Britain.
One purpose of this study was to construct a list of words that would be understandable to
the Chinese population as well as the English population. In order to produce a list from the
data of this study, three criteria were used:

1) words that received mean understandability ratings less than 4.0 were excluded.
2) words for which the standard deviation exceeded 2.0 were excluded.
3) words with significant difference in understandability ratings between the Chinese and
English subjects were excluded.

Using the three criteria, 31 words were eliminated. The 12 remaining words are: CAREFUL,
CAUTION, HARMFUL, SERIOUS, DANGER, FATAL, DANGEROUS, HAZARD,
HAZARDOUS, TOXIC, EXPLOSIVE and POISON. This derived list of words would be interpretable by both the Chinese and the English populations. Lists such as this one, as well as that of Wogalter and Silver (1995), would be useful to individuals designing warnings when selecting alternative words that convey various hazard levels, including substitutes for other words. A designer should select words that are the most understandable to the target population and that differ significantly along the hazard dimension.
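A minimal sketch of the three-criterion filter is given below for readers who wish to apply it to their own data; the long-format data frame, its column names and the use of an independent-samples t-test for criterion 3 are assumptions, since the raw ratings are not reproduced here.

# Sketch of the three exclusion criteria described above, applied to a
# hypothetical long-format table of understandability ratings with columns
# 'word', 'group' ('Chinese' or 'English') and 'rating'.
import pandas as pd
from scipy.stats import ttest_ind

def select_words(ratings, min_mean=4.0, max_sd=2.0, alpha=0.05):
    keep = []
    for word, sub in ratings.groupby("word"):
        if sub["rating"].mean() < min_mean:      # criterion 1: low understandability
            continue
        if sub["rating"].std() > max_sd:         # criterion 2: high disagreement
            continue
        chinese = sub.loc[sub["group"] == "Chinese", "rating"]
        english = sub.loc[sub["group"] == "English", "rating"]
        _, p = ttest_ind(chinese, english)       # criterion 3: group difference
        if p < alpha:
            continue
        keep.append(word)
    return keep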
The results suggested that there was no significant difference in the hazardousness ratings
for all 43 signal words between the Chinese and English subjects and that the Chinese
subjects did perceive similar connoted hazard levels from the set of signal words as the
English subjects.
However, the results showed that DEADLY was rated significantly higher on
hazardousness than DANGER. DANGER was rated significantly higher on hazardousness
than either WARNING or CAUTION. Nonetheless, WARNING was not rated significantly
higher on hazardousness than CAUTION. These results concurred with the findings of other

studies such as Wogalter and Silver (1990) and Dunlap et al (1986), in that they did not support the difference between WARNING and CAUTION asserted in standards and guidelines (FMC
Corporation, 1985).

Warning pictorials
The result of ANOVA suggested that there was a significant difference in the
understandability ratings of the 12 pictorials between the Chinese and English subjects and
indicated that the understandability ratings of the Chinese subjects were significantly lower
than those of the English subjects. This result echoes the 1972 Baghdad tragedy cited by Casey (1993), showing that pictorials are not always easy to understand even when the language barrier has supposedly been lifted. Therefore, it is very important for designers to choose appropriately understood pictorials in order to convey the intended meanings. It is also recommended that designers confirm the meaning of the pictorials with the target population and use them prudently.
The other results suggested that there was a significant difference in the comprehension
rates of the 12 pictorials between the Chinese and English subjects and indicated that the
mean of comprehension rates of the Chinese subjects was significantly lower than that of the
English subjects. This result concurred with the findings of an Australian study by Cairney and Sless (1982), in which the results of the first recognition test of the Vietnamese
participants were significantly lower than those of the other groups.
In summary, no significant difference was found in the understandability between the
Chinese subjects and the English subjects for the eight commonly used signal words, but there was a significant difference in understandability for the other less commonly used signal
words. The other results suggested that the Chinese subjects did perceive similar connoted
hazard levels from the set of signal words as the English subjects. The other findings of the
study suggested that both the comprehension rates and understandability ratings of the 12
pictorials of the Chinese subjects were significantly lower than those of the English subjects.

References
Cairney, P. and Sless, D. 1982, Communication effectiveness of symbolic safety signs with
different user groups, Applied Ergonomics, 13, 91–97.
Casey, S M. 1993 Set Phasers On Stun and Other True Tales of Design, Technology and
Human Error. (Aegean Publishing Company)
Dunlap, G.L., Granda, R.E. and Kustas, M.S. 1986, Observer perceptions of implied hazard:
Safety signal words and color words. Research Report No. Tr00.3428
FMC Corporation 1985, Product Safety Sign and Label System FMC Corp., Santa Clara
Leonard, S.D. and Karnes, E.W. 1993, Development of warnings of resulting from forensic
activity, Proceedings of the Human Factors Society 37th Annual Meeting 501–505.
Human Factors and Ergonomics Society, Santa Monica
Wogalter, M.S. and Silver, N.C. 1990, Arousal strength of signal words, Forensic Reports,
3, 407–420.
Wogalter, M.S. and Silver, N.C. 1995, Warning signal words: connoted strength and
understandability by children, elders and non-native English speakers, Ergonomics,
38(11), 2188–2206.
VERBAL PROTOCOL
ANALYSIS
THINKING ABOUT THINKING ALOUD

M.J. (Theo) Rooden

School of Industrial Design Engineering
Delft University of Technology
2628 BX Delft, the Netherlands
e-mail: m.j.rooden@io.tudelft.nl

In this paper the possibilities and limitations of thinking aloud, as a technique to elicit perceptions and cognitions during everyday product usage, are discussed. Findings from the literature are compared to experiences with the application of thinking aloud in users’ trialling conducted at TU Delft.

Introduction
In users’ trialling, users in action are observed to gain information to improve the design at
hand. Subjects may be asked to use existing consumer products or design models, ranging
from rough sketches to working prototypes. Observations of user activities are the most
important data from users’ trialling. An overview of interaction difficulties can help designers
focus on certain aspects of the design. However, when information is available on users’
perceptions and cognitions at the moment of experiencing interaction difficulties, users’
trialling can really be a design tool, since causes of difficulties may be traced. Users’
perceptions and cognitions are as a rule not directly observable. The most direct method to
elicit perceptions and cognitions is called thinking aloud (TA), which means that subjects are
asked to verbalise their thoughts concurrently, while using a product.
When pros and cons of TA in users’ trialling are discussed, Ericsson and Simon (E&S) are
often referred to as proponents of TA. However, E&S (1993) regard TA as a formal method with many rules and restrictions. More detailed insight into their views on TA is required to assess the possibilities of TA in users’ trialling. In this paper, E&S’s basic considerations are
presented. These views are then discussed in the context of users’ trialling both in general and
with regard to specific experiences with the application of TA in an experimental study. This
leads to some proposals on how to benefit from the methods of TA in users’ trialling.

Thinking aloud (considerations of E&S)


Techniques of TA originated in cognitive psychology, and they are often applied to get insight
into thought processes. TA and derived techniques are extensively described by E&S (1993).
Results of empirical research on TA are discussed in the frame of an information processing
theory. Cognitive processes are regarded as sequences of internal states successively
transformed by a series of information processes. Information is stored very briefly in sensory stores, then in short-term memory (STM), which has limited capacity and intermediate duration, and in long-term memory (LTM), which has very large capacity and relatively permanent storage. Ericsson & Simon state that only the contents of STM can be verbalised. In automated behaviour no information is heeded in STM; therefore, there is nothing relevant to be verbalised. When someone is writing a letter, for instance, he or she is not thinking about
the pen, and the way it is manipulated, but is probably thinking about the contents of the
letter. A verbal report will not reveal information about usage of the pen. Only tasks which
require some form of problem solving lend themselves to relevant TA.
Information from thought processes that is available in oral form is easiest to verbalise, because
this can be done directly. In other cases, translation from thought to verbalisation is necessary.
The vocabulary available may not always be adequate to express, for instance, pictorial information
and manipulations (E&S, 1993, p92) in detail. In research into thought processes it is important
not to disturb these thought processes by TA. E&S (1993, p79) argue that asking subjects for
explanations concurrently changes these processes. When E&S’s requirements for successful
TA are met, verbal protocols may capture a considerable part of the thought process. In those
cases a formal analysis is suggested in which the protocols are segmented and the segments are
categorised. An ‘ideal’ retrospective report is given by subjects immediately after completion of
the task, with much information still in STM (E&S, 1993, p19).
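E&S describe the formal analysis only in outline; as an illustration of what segmenting and categorising a protocol can involve, a small sketch follows, in which the category scheme and keyword rules are invented for the example and are not taken from E&S or from the study reported here.

# Illustrative segmentation and coding of a think-aloud transcript.
# The categories and keyword rules are invented for this example.
import re

CATEGORIES = {
    "action":     ["press", "push", "put", "wrap", "switch"],
    "perception": ["see", "read", "says", "display"],
    "evaluation": ["don't understand", "strange", "wrong", "good"],
}

def segment(transcript):
    """Split a transcript into utterances at sentence-like boundaries."""
    return [s.strip() for s in re.split(r"[.?!]\s*", transcript) if s.strip()]

def code(utterance):
    lowered = utterance.lower()
    for category, keywords in CATEGORIES.items():
        if any(k in lowered for k in keywords):
            return category
    return "other"

transcript = "I press the button. The display says error. I don't understand this."
for utterance in segment(transcript):
    print(code(utterance), "->", utterance)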

Thinking aloud in users’ trialling

E&S’s considerations and users’ trialling


It is clear that TA in users’ trialling is performed under conditions far from ideal to get ‘rich’
protocols, especially when usage of everyday products is investigated. Using consumer
products consists partly of automated behaviour, and actions, such as manipulations may be
difficult to verbalise in detail. Studying human computer interaction probably benefits more
from TA, because then more problem solving may be taking place, and more information is
already in oral form.
It is expected that thinking aloud when using everyday products may reveal little about the
thought processes at the moment. Therefore it is misleading to justify application of TA in
users’ trialling by simply referring to Ericsson and Simon.

Different aims of TA in users’ trialling


In users’ trialling the aims of TA are different from, and possibly much more modest than,
charting thought processes. Having subjects think aloud may help to learn which information
from a product is used in what way. The aim is not to get ‘rich’ verbal protocols which can be
analysed in a formal way. Verbal reports are additional to data from direct observation of use
actions. Each utterance of information is valuable, and spontaneous verbalisations of

perceptions are welcome as well. Unless subjects are forced to think aloud, there seem to be no large disadvantages to applying TA, although one can never be sure that TA does not interfere with carrying out the task. Information on usage elicited retrospectively does not necessarily reflect thoughts during usage, especially when subjects are asked for explanations.
In many cases, however, it can help understand what happened during usage.

Experiences with thinking aloud in users’ trialling


In an experiment TA was applied to elicit users’ perceptions and cognitions. The aim of the
experiment was to investigate possibilities for users’ trialling early in design processes. In
this paper the focus is only on experiences with TA in this experiment. Subjects (age 20 to 78)
operated a non-professional blood pressure monitor. They were asked to think aloud
concurrently in a casual way. Information was also elicited retrospectively. Some of the
characteristics of the verbal reports, which are illustrated by an extract of one of the protocols
(figure 1), are discussed.

Figure 1. Extract from a verbal report.

No information on skilled behaviour


Usage of everyday products is expected to consist largely of skill- and rule-based behaviour
(Rasmussen, 1986, Kirlik, 1995). Also in usage of the blood pressure monitor, parts of the
interaction consisted of automated behaviour, for instance pushing a button to switch the

monitor on, and using a Velcro fastening. No detailed information was supplied about these
actions, as could be expected.

Subjects remain silent for longer periods


There was large variation in the amount that subjects verbalised. However, most subjects remained silent for longer periods. The fact that subjects remained silent cannot entirely be explained by the fact that skilled behaviour and manipulations are non-verbalisable.
Subjects may have had other thoughts at the moment. Maybe they judged these thoughts
irrelevant for the research (Wright, 1980; Dobrin, 1986). This may have been the case while
waiting for the results to appear. Maybe they simply forgot to think aloud.
Subjects may also have remained silent because they are ‘bad’ verbalisers. In more
formal TA it is advised to throw away poor verbal reports, to select subjects with good verbal
capacities, and to train subjects to think aloud. It is also advised to remind subjects to think
aloud when they remain silent. We refrained from all of this because we did not want these
means to interfere with carrying out the task. We only intervened when the interaction got
stuck.
It might be expected that subjects would start verbalising when getting into trouble (Bowers
and Snyder, 1990; Ohnemus and Biers, 1993). However, this was not often the case, apart
from general remarks like ‘I don’t understand’. Maybe they focused all their mental capacities on solving the problem (Page and Rahimi, 1995) and doing a good job, starting to verbalise again once out of trouble.

Mainly procedural information


Most verbalisations consisted of verbalised actions. As this is often observable information,
such verbalisations are not very helpful (except when a drawing of a design proposal is
presented to the subjects), and they prevent subjects from really verbalising thoughts at that
time. Sometimes subjects verbalised what they perceived, they read product graphics aloud
for instance. This is valuable information from TA. The fact that manipulations are difficult to
verbalise is not a serious problem, when these manipulations can be directly observed.
Bowers and Snyder (1990) and Ohnemus and Biers (1993) also found that concurrent TA
mainly yields procedural information. They prefer retrospective probing, as this yields more
design relevant information. However, they do not question the validity of retrospective information.

Retrospective information to be treated with caution


Subjects commented afterwards while viewing a video of their usage, and we asked questions. Sometimes the retrospective account contradicted the observed actions, for instance regarding the order of applying the cuff and switching the monitor on. Sometimes subjects came up with a list of more or less plausible reasons for certain actions; presumably, they tried to justify their actions. Negative comments may be more useful: when a subject is asked 'did you see the sticker with instructions on the cuff?', he or she will probably not answer 'no' when in fact he or she did see it. However, some subjects may hesitate to admit having overlooked the sticker.

Discussion
From E&S’s views it might be concluded that in users’ trialling spontaneous, concurrently
expressed remarks are most valuable, and at the same time that the verbal reports will be
poor. The verbalisations seem to be only traces of thought processes, which consist of much
more than can be voiced. Retrospective techniques, which include interviewing, should be
applied with caution, because retrospective reports may not reflect thoughts at the moment of
task performance.
The goals of TA in a design context differ from E&S’s goals of TA. Verbal reports labelled
as useless by E&S may be very useful for designers. Techniques of TA may be modified. Although retrospective reports may suffer from memory processes, they should not be put aside altogether, as these techniques are supposed to yield design-relevant information.
There are a few alternative methods to elicit perceptions and cognitions, mainly developed
in industry. However, these techniques bring other uncertainties with them. In co-discovery (Kemp and van Gelderen, 1996), where subjects work in pairs, conventions of conversation may interfere with expressing thoughts. Question-asking protocols (Kato, 1986) obscure regular usage. Such techniques may be beneficial in specific cases, respectively to inspire designers and to find out when users need help from a manual or a help desk.

Literature
Bowers, V.A. and Snyder, H.L. 1990, Concurrent versus retrospective verbal protocol for
comparing windows usability, Proceedings of the Human Factors Society 34th Annual
Meeting, 1270–1274
Dobrin, D.N. 1986, Protocols once more, College English, 48, 713–725
Ericsson, K.A. and Simon, H.A. 1993, Protocol Analysis, (MIT Press, Cambridge, MA)
Kato, T. 1986, What “question-asking protocols” can say about the user interface, Int. J.
Man-Machine Studies, 25, 659–673
Kemp, J.A.M. and van Gelderen, T. 1996, Co-discovery exploration: an informal method for the iterative design of consumer products. In P.W. Jordan et al. (eds.) Usability evaluation in industry, (Taylor & Francis, London), 139–146
Kirlik, A. 1995, Requirements for psychological models to support design: toward ecological task analysis. In J. Flach (ed.) Global perspectives on the ecology of human-machine systems, (Lawrence Erlbaum Associates, Hillsdale), 68–120
Ohnemus, K.R. and Biers, D.W. 1993, Retrospective versus concurrent thinking-out-loud in
usability testing, Proceedings of the Human Factors Society 37th Annual Meeting,
1127–1131
Page, C. and Rahimi, M. 1995, Concurrent and retrospective verbal protocols in usability
testing: is there value added in collecting both?, Proceedings of the Human Factors
Society 39th Annual Meeting, 223–227
Rasmussen, J. 1986, Information processing and human-machine interaction: an approach to cognitive engineering, (North Holland, Amsterdam)
Wright, P. 1980, Message-evoked thoughts: Persuasion research using thought verbalization,
Journal of Consumer Research, 7, 151–175
ADJUSTING THE COGNITIVE WALKTHROUGH USING
THE THINK-ALOUD METHOD
A case study on detecting learnability problems in software products

Marjolijn Verbeek and Herre van Oostendorp

Cap Gemini Nederland B.V., Methods and Tools,
P.O. Box 7525, NL-3500 GN Utrecht
MVerbeek@inetgate.capgemini.nl

Utrecht University, Department of Psychonomics,
Heidelberglaan 2, NL-3584 CS Utrecht
H.vanOostendorp@fsw.run.nl

The aim of this study was to analyze the sensitivity of the cognitive
walkthrough method and to construct an improved version of the
walkthrough question form by adjusting it with the assistance of the think-
aloud method. These two methods were applied to evaluate the ease of
learning of a graphical user interface. It appeared that the think-aloud
method had additional value over the cognitive walkthrough method,
because it detected a wider range of learnability problems, at least for
novice users. The results of the two methods were integrated into a new,
adjusted cognitive walkthrough form.

Introduction
Several contributions have been dedicated to the cognitive walkthrough method since it was
launched (e.g. Wharton et al, 1992). This study aimed to contribute to the effectiveness of the cognitive walkthrough in detecting learnability problems in software products. The walkthrough method focuses primarily on ease of use for first-time users. In other words, the cognitive walkthrough assesses a user interface for its support of learning by exploration. Exploration-supporting interfaces will be of growing importance as the number of different system applications increases, as does the population of end-users who have often received no formal training. Ease of learning is therefore an important aspect of a usable
software product.
Three categories of learnability can be distinguished. The first category involves support
of task-driven user-events, while the other two indicate the degree of exploration-support of
an interface design: 1. the 'task to action mapping' of the interface object involved (the object provides insight into the situation in which it can be used because it provides cues compatible with the user's goal); 2. the 'name to effect mapping' of the interface object involved (the object's label or cues on the screen provide a clear indication of its function); and 3. the 'affordance' (the user directly perceives how to operate on the object, e.g. to drag and drop) (Draper & Barton, 1993). The present study applied the cognitive walkthrough
(Lewis et al, 1990) and think-aloud method (Ericsson & Simon, 1993) to detect learnability
problems of the three types mentioned above.

The Experiment
Twenty end-users took part in this study. Half of this group were novice users and the other half were experienced users; both groups were randomly divided over the two methods in a between-subjects design. Novice users had domain knowledge (planning, resource allocation, time scheduling, etc.) but had not yet had the opportunity to put it into practice, although they were experienced with Microsoft Windows® applications. Experienced
users also had to have practical experience with the product.
Both the walkthrough and the think-aloud method evaluated the same graphical user
interface of a fully operating system, Project Workbench® PMW (short: PMW). PMW is
marketed by ABT Corporation and provides a project scheduling, tracking, reporting and
analysis capability for managing a wide variety of project environments, from small
maintenance activities to multiple, complex projects and programmes.
Four tasks were evaluated, which were representative of the system. Two tasks were performed using one of the two methods, the second task being more complex than the first. The other two tasks were control tasks; here, task performance was measured in order to ensure that the two groups were comparable.

Cognitive Walkthrough Method


In order to map the mental processes, we needed to know what goals and which action repertoire were at the users' disposal. Next, to assess the learnability, the user goals and actions were compared to the formal goals and actions, i.e. the goals and actions required by the interface in order to successfully perform a task. The cognitive walkthrough is a theoretically structured evaluation process that takes the form of a list of questions (see Table 1; Lewis et al, 1990). The evaluation of each step in a task involves the subjects answering the questions while interacting with the interface. The subject begins by giving a description of the current goals and the next action (questions 1 and 2). The next series of questions (questions 2a through 6) evaluates the ease with which the subject will be able to correctly select that action and execute it. Next, a description of the system response is given, as perceived by the user (question 7). Questions 7a and 7b evaluate the adequacy of the system response. The final questions (questions 8 to 8b) evaluate the user's ability to form an appropriate goal for the next action or to detect that the task has been completed. The 'task to action mapping' is inferred from the answer to question 3. The 'name to effect mapping' of the object mentioned by the subject at question 2 is inferred from the answer to question 4. The 'affordance' is inferred from question 6, where the user explains the interaction with the object mentioned at question 2.

Table 1. Cognitive walkthrough question form

1. Describe your immediate goal:
2. The (first/next) atomic action you take:
2a. Is it obvious that action is available? Why/why not?
3. Is it obvious that action is appropriate to goal? Why/why not?
4. How do you associate the description with action?
4a. Problem associating? Why/why not?
5. Are all other available actions less appropriate? For each, why/why not?
6. How are you going to execute action?
6a. Problems? Why/why not?
7. Execute the action. Describe system response:
7a. Is it obvious that progress has been made toward goal? Why/why not?
7b. Can you access needed information in system response? Why/why not?
8. Describe appropriate modified goal, if any:
8a. Is it obvious that goal should change? Why/why not?
8b. If task completed, is it obvious? Why/why not?
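For illustration only, the question form and its mapping onto the three learnability categories could be captured in a small data structure for logging a subject's answers per atomic action. The sketch below (Python) is not part of the original method; the example goal, object and answers are invented.

# Minimal sketch (illustrative only): log walkthrough answers per atomic action
# and report which learnability categories received a negative answer.
from dataclasses import dataclass, field

# Which form questions probe which learnability category, as described above.
CATEGORY_OF_QUESTION = {
    "3": "task to action mapping",   # is the action obviously appropriate to the goal?
    "4": "name to effect mapping",   # does the label/cue convey the object's function?
    "6": "affordance",               # does the user see how to operate on the object?
}

@dataclass
class WalkthroughStep:
    goal: str                                     # answer to question 1
    action: str                                   # answer to question 2
    answers: dict = field(default_factory=dict)   # question id -> verbatim answer

    def problems(self):
        """Learnability categories whose question received a negative answer."""
        return [CATEGORY_OF_QUESTION[q] for q, a in self.answers.items()
                if q in CATEGORY_OF_QUESTION and a.strip().lower().startswith("no")]

# Example step for one subject (answers are invented).
step = WalkthroughStep(
    goal="allocate a resource to the task",
    action="select the TAR object",
    answers={"3": "yes, the menu name matches my goal",
             "4": "no, I cannot tell what this button does",
             "6": "no, I did not realise it could be dragged"},
)
print(step.problems())   # ['name to effect mapping', 'affordance']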

Think-aloud Method
Via the think-aloud method insight is obtained into the user’s thoughts during task performance
(see for example De Mul & Van Oostendorp, 1996). With this insight one can find out where the user's attention is drawn while interacting with the interface. Analysis of the verbal protocols then makes it possible to determine whether there is a good fit between what users want and can do and what the system requires and provides in terms of possibilities and feedback. From the verbalizations of the subjects, the interaction with each interface object was judged in terms of problems resulting from one of the three learnability categories…

Results

Data Analysis
In order to investigate the learnability problems detected by the cognitive walkthrough method, the goal-action structures of users were mapped onto the goals and actions which are required or supported by the interface. From this comparison we were able to determine what percentage of users found the solution path, i.e. the sequence of formal goals and actions required by the design of the interface. The number of deviating goals and actions was also analysed. Every deviating user action was judged as performed on the grounds of either the 'task to action mapping', the 'name to effect mapping' or the 'affordance' of the interface object involved. In sum, the three categories of learnability were inferred from the answers to the corresponding questions, as mentioned earlier.
Table 2. Median proportions of general success for every object

In this way, each interface object on which an action other than the one of the solution path was performed could be identified as caused by one of the three types of learnability problems. An object was judged as problematic if less than 60% of the users found the formal action belonging to the solution path.
In preparation for the data analysis of the verbal protocols, a list of 23 PMW objects was
composed, which are to be used in order to perform
the tasks successfully. The verbal protocols of the
subjects could then be segmented in accordance to
these objects. Every verbalization actually relates
to a planned or realized action of the subject with
regard to an object, and can thus be seen as a 'user-event'
(i.e. an attempt to learn by exploration or to find a
suitable object). We time-stamped each user-event
and cross-referred this with the object in question.
Then followed a binary decision for each of the three
categories of learnability: 1. Is the object found when
it is needed (‘task to action mapping’)?, 2. Is the
object understood when explored with (‘name to
effect mapping’)? and 3. Is the object successfully
operated when used (‘affordance’)? All user-events
were listed chronologically for each object of the interface (Draper & Barton, 1993). We obtained
a success-failure proportion with values between 0 and 1, by dividing the number of successes
by the sum of successes and failures. This figure reflects the general success of exploration of
that object per subject, where a value of 1 means that the object involved has been successfully
used. Then, for the total group of subjects the median value of the general success figure was
computed for every interface object. This value indicates the proportion of general success of
the total frequency with which an object was used. The median was chosen as the statistic because it suits the quantitative comparison of the two sets of data. One part of the scoring scheme is shown in Table 2 (see Verbeek, 1997, for details). Table 2 shows, for example, that the object TAR (number 14, with the user-event "allocate a resource to the task") was not used very successfully. It scores especially low on the category 'name to effect mapping' as well as on the category 'affordance'. This means that the interface object involved causes learnability problems, in the sense that the object did not make sufficiently clear what it does and the user could not perceive directly how to operate on it. The results shown in Table 2 seem to imply that user-events with an exploration-driven character are weakly supported by the interface. This is in contrast to the task-driven user-events, since hardly any 'task to action mapping' problems were found.
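As a rough illustration of this scoring, the sketch below (Python; the event data, object names and the single cut-off are invented, and the paper applies somewhat different cut-offs for the two methods) computes each subject's success proportion per object and the median over subjects.

# Minimal sketch (assumed data layout): per subject and interface object, count
# successful and failed user-events, compute the success proportion, and take
# the median over subjects for each object.
from statistics import median
from collections import defaultdict

# (subject, object, success) triples, one per user-event; all values invented.
events = [
    ("s1", "TAR", False), ("s1", "TAR", False), ("s1", "TAR", True),
    ("s2", "TAR", False), ("s2", "TAR", False),
    ("s1", "OK_BUTTON", True), ("s2", "OK_BUTTON", True), ("s2", "OK_BUTTON", False),
]

counts = defaultdict(lambda: [0, 0])          # (subject, object) -> [successes, failures]
for subject, obj, success in events:
    counts[(subject, obj)][0 if success else 1] += 1

per_subject = defaultdict(dict)               # object -> {subject: success proportion}
for (subject, obj), (succ, fail) in counts.items():
    per_subject[obj][subject] = succ / (succ + fail)

THRESHOLD = 0.6                                # example cut-off for "problematic"
for obj, props in per_subject.items():
    med = median(props.values())
    flag = "problematic" if med < THRESHOLD else "ok"
    print(f"{obj}: median success {med:.2f} ({flag})")

An object whose median falls below the chosen cut-off would be flagged for closer inspection, mirroring the problematic-object rule described above.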

The Comparison between Methods


With the cognitive walkthrough and the think-aloud method learnability problems were
detected and identified as being related to one of three categories: ‘task to action mapping’,
‘name to effect mapping’ or ‘affordance’ of an interface object. These problems refer to the
interface objects indicated as problematic by the cognitive walkthrough method (used
successfully by less than 60 percent of the subjects) and by the think-aloud method (used
successfully by less than 50 percent of the subjects). The cognitive walkthrough method
detected learnability problems mainly of the category ‘task to action mapping’, which was
true for both groups of subjects. As opposed to this, the think-aloud method detected
learnability problems of all three categories: ‘task to action mapping’, ‘name to effect
mapping’ and ‘affordance’ of interface objects. These problems were specifically
experienced by novice users. While interpreting the conclusions one has to keep in mind that
this study was specifically aimed at detecting learnability problems. Therefore, no absolute
conclusion can be drawn about the usability of PMW.
From these results we may conclude that the think-aloud method had additional value over the cognitive walkthrough, because it detected a wider range of problems, at least for novice users, particularly involving the exploration-driven user-events. The learnability problems detected by the think-aloud method but not by the walkthrough were seen as "missing values" in the efficiency of the cognitive walkthrough method. The results of the two methods
were integrated in a new, adjusted cognitive walkthrough form, with which all three types of
learnability problems can be detected. The adjustment was made primarily by modifying the
questions 4 and 7, with which respectively ‘name to effect mapping’ and ‘affordance’ are
measured. Secondly, in the original form the first questions were task oriented and the following ones exploration oriented. Placing question 4 at the beginning of the form will consequently allow the user to first observe and explore the interface more thoroughly before selecting an action.
This way, a more exploration oriented task performance is simulated and more emphasis is
given to the ‘name to effect mapping’ and ‘affordance’ of interface objects.

Discussion
Usually, the cognitive walkthrough is applied by the designer of the evaluated system or by an
expert in cognitive psychology. The expert walks through the interface design, simulating the
user’s interaction with the interface while performing a specific task. In the present study
however, the cognitive walkthrough is performed by the subjects themselves. This difference
had the advantage that we could derive data directly from representative users instead of
experts merely estimating the user’s goals and actions.
The combination of the two methods can detect a broader range of learnability problems
than they do individually. This case study demonstrated that the think-aloud method was
needed to develop an adjusted cognitive walkthrough, because the think-aloud method
detected problems that the walkthrough did not detect. We believe that the main cause of this is that there is a close correspondence between the verbalizations and the actual processes used to perform the task. In spite of the verbalizations' lack of coherence and partial incompleteness (for example, leaving unanswered how the solution was generated in detail and why a given action structure was adopted among many possible scenarios), they still provide a more accurate picture than the interrogative approach of the cognitive walkthrough. The subjects
who performed the think-aloud method were focused on completing the task, the verbalizing
of the heeded information being secondary. The subjects who performed the walkthrough,
however, paid relatively more attention to the question form than to completing the task. Of
course, the new version of the cognitive walkthrough should be tried out in order to test its
efficiency empirically.

References
De Mul, S. and van Oostendorp, H. 1996, Learning user interfaces by exploration, Acta
Psychologica 91, 325–344
Draper, S.W. and Barton, S.B. 1993, Learning by exploration, and affordance bugs, Adjunct
Proceedings of INTERCHI ’93 Conference, April 24–29 1993, (ACM, Amsterdam,
The Netherlands), 75–76
Ericsson, K.A. and Simon, H.A. 1993, Protocol analysis. Verbal reports as data, Revised
edition, (MIT Press, Massachusetts)
Lewis, C., Polson, P., Wharton, C. and Rieman, J. 1990, Testing a walkthrough methodology
for theory-based design of walk-up-and-use interfaces, Proceedings of CHI ’90
Conference, April 1–5 1990, (ACM, Seattle, Washington), 235–242
Verbeek, M.L. 1997, Adjusting the cognitive walkthrough method with the assistance of protocol analysis. A case study of detecting learnability problems in a software product, Dutch Internal Report, (Utrecht University, The Netherlands)
Wharton, C., Bradford, J., Jeffries, R. and Franzke, M. 1992, Applying cognitive
walkthroughs to more complex user interfaces: experiences, issues, and
recommendations, Proceedings of CHI ’92 Conference. May 3–7 1992, (ACM,
Monterey, California), 381–388

Acknowledgements
Thanks are due to ABT Benelux B.V. for providing the means for conducting this study.
We especially thank Dr. Maarten Bakker for his enthusiasm during the study and for his useful comments on a previous version of this paper. In addition, we thank the participating persons for their cooperation in the experiment.
VERBAL PROTOCOL DATA FOR HEART AND LUNG
BYPASS SCENARIO SIMULATION “SCRIPTS”

Joyce Lindsay* and Chris Baber

Industrial Ergonomics Group,


School of Manufacturing & Mechanical Engineering,
University of Birmingham, B15 2TT, United Kingdom.

A perfusionist is the medical professional who operates the heart and lung
bypass circuit during open heart surgery. The perfusion circuit pumps
oxygenated blood to the patient’s body tissues while the heart is temporarily
arrested. Currently, perfusionists are given practical training during real
surgery cases. This paper discusses how the current training regime in the
UK could be improved to help trainees become more competent. A small
part of more extensive research towards the development of a training
simulator, this study uses verbal protocol data, real-time parameter
monitoring and critical incident technique (CIT) data to form a database of
perfusion scenarios. With extensive data, this will eventually be transformed
into “scripts” from which simulations will be run.

Introduction
In cardiopulmonary bypass (CPB) surgery the heart is temporarily arrested to allow surgical
repair. Meanwhile, the patient’s tissue oxygenation is maintained by the perfusionist who
also manages a number of other physiological parameters to maintain an optimal internal
environment for the patient. Fully qualified perfusionists operate the circuit on their own.
Trainee perfusionists, on the other hand, operate under the supervision of a qualified
professional. While the complex nature of perfusion suggests that on-the-job training is a suitable method of training, the critical nature of perfusion suggests that it is not. Because
the heart surgery team (i.e. surgeons plus perfusionists plus anaesthesiologists) is tightly
coupled, trainees learning during clinical cases may contribute risk to the patient in three
ways. Their own error may directly affect the patient; their error may influence the surgeon or
the anaesthesiologist, causing them to make an error; the surgeon or anaesthesiologist may
take an action which causes the perfusionist to make an error.
With the advance of medical technology there has been a corresponding rise in the
incidence of operator errors (Cooper et al, 1984). Bogner (1994) attributes this in part to a
lack of task specific training in the medical domain, to which perfusion is no exception. The
training of UK perfusionists is currently regulated by the European Board of Cardiovascular
Perfusionists (EBCP). Trainee perfusionists learn theoretical basics before they begin on-the-
job training, in which they must complete 100 clinical cases and be considered to be of a suitable standard before they can be assessed (Davis, 1996). The EBCP is currently introducing the European Certificate of Cardiovascular Perfusion (ECCP) to create a uniform standard of perfusion training across Europe and phase out national qualifications (Sanger,
1997). Candidates for the ECCP must satisfy certain criteria.

• They must have graduated from an accredited institution.


• They must have practised clinical perfusion for at least two years.
• They must be currently practising clinical perfusion.
• Their supervisor(s) must confirm the following:
- That the applicant has a minimum of 100 clinical cases and that they are competent to
practise unsupervised.
- That the applicant can competently avoid and manage perfusion accidents.
- That the applicant can set-up and operate a wide range of equipment used for CPB
(EBCP, 1997).

Broadly, these criteria can be both quantified and assessed against specific standards.
However, the criterion “that the applicant can competently avoid and manage perfusion
accidents” could prove problematic to assess for the following reasons. Since critical
incidents and failures in CPB are rare (Wheeldon 1981, in Longmore 1981), it is likely that a
trainee will first encounter a critical incident as a fully qualified perfusionist. Even if a trainee
experiences a critical incident, the supervisor will assume responsibility. In either case, the
supervisors cannot claim that the trainee can competently avoid and manage perfusion
accidents. If trainees cause a critical incident then they have not managed to avoid one, and if they avoid critical incidents they have not shown that they can cope with one. It is suggested that an
additional technique is required to improve training as inadequate training usually manifests
itself in critical incidents (Weinger and Englund, 1990).
Botney et al (1993) demonstrated that very few anaesthesiologists could deal with
simulated critical incidents in the manner most likely to reduce risk to the patient. Like
perfusionists, anaesthesiologists rarely experience critical incidents and have not been
specifically trained to deal with them. A similar phenomenon may therefore exist among
perfusionists. A training simulator would provide the opportunity to experience and practise
routine and critical scenarios outwith clinical cases. The aim of this research is to produce a
training simulator for perfusion to provide such training opportunities. The intended uses of the simulator are:

• Introduction to perfusion techniques.


• Recognition of and response to critical incidents.
• Recognition of monitor artefacts.
• Practise in operating real equipment to acquire skills.
• Provision of greater trainer control, for example by allowing repeated practise of one
process until it has been mastered.
• To test trainee progress.

Thus, a data collection technique was employed which elicited enough detail to develop
“scripts” for perfusion simulation without disrupting the perfusionist.

Methodology
The methodology was divided into two main phases: verbal protocol (VP) and temporal
physiological parameter recording. Five perfusionists were asked to provide concurrent VPs
during surgery, in other words to provide a running commentary about their actions and associated reasoning. The data were recorded as non-intrusively as possible, using a simple arrangement of a tie-pin microphone and a Walkman. This method recorded background noises which often provided clues as to the stage of surgery or to what the perfusionist was referring if their explanation was not entirely clear. Carried out during clinical cases, each operation was recorded for twenty to forty-five minutes and included putting the patient onto bypass, bypass itself, and bringing the patient off bypass. The analyst remained present throughout
so that the perfusionist felt that they were talking to someone rather than to themselves.
Concurrently, the analyst recorded the values of the most important monitors in the
perfusion circuit (Lindsay, Baber and Carthey, 1997) at two-minute intervals. The data were recorded onto prepared charts.

Results
The data from the VP were fully transcribed and separated into discrete tasks. Analysis
involved checking the transcripts for those events in the circuit which prompted the
perfusionists to take certain actions. Examples of these are given in Table 1. These events were then matched against the parameter values recorded by the analyst to determine the value(s)
which prompted particular action(s). For instance, the perfusionist may have commented that
“patient blood pressure was high” so he would “increase the isoflurane administration
levels.” The charts indicated what a “high blood pressure value” was and by how much the
isoflurane administration was increased.
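One way such action-response pairs might be assembled is sketched below; this is not the authors' software, and all parameter names, readings and time stamps are invented for illustration.

# Minimal sketch (all values invented): pair a time-stamped verbal-protocol
# remark with the parameter chart entry recorded closest in time, so that a
# comment such as "blood pressure was high" can be tied to the actual reading
# and to the action that followed.
parameter_chart = [                      # readings taken at two-minute intervals
    {"t": 10, "mean_arterial_pressure": 72, "isoflurane_pct": 0.8},
    {"t": 12, "mean_arterial_pressure": 95, "isoflurane_pct": 0.8},
    {"t": 14, "mean_arterial_pressure": 88, "isoflurane_pct": 1.2},
]

vp_events = [                            # transcribed verbal-protocol events
    {"t": 12.5, "remark": "patient blood pressure is high",
     "action": "increase isoflurane administration"},
]

def nearest_reading(t, chart):
    """Return the chart record recorded closest in time to the verbal event."""
    return min(chart, key=lambda rec: abs(rec["t"] - t))

# Build simple condition -> action entries for the scenario database.
script_entries = []
for ev in vp_events:
    reading = nearest_reading(ev["t"], parameter_chart)
    script_entries.append({
        "trigger_remark": ev["remark"],
        "parameter_values": reading,
        "response": ev["action"],
    })

print(script_entries[0]["parameter_values"]["mean_arterial_pressure"])   # 95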
From an extensive collection of VP and parameter data it is hoped to create a database of
information from which simulation “scripts” can be formulated. The scripts will include
routine scenarios and critical events where the perfusionist may have to problem solve. Such
scenarios will be identified by a critical incident technique (CIT) which is already underway
and the future implementation of failure modes and effects analysis (FMEA). Each script will
contain contextual information (i.e. patient and operation details) because this may influence
the response adopted by the perfusionist. In one example, when the patient had just been put
onto bypass, the perfusionist commented that the venous oxygen levels were high which
indicated that the patient was not receiving enough oxygen. At a cooler temperature, however,
this would have been an acceptable level of venous oxygen. Such examples suggest that
perfusionists maintain excellent situational awareness, noting the suitability of physiological
parameters as they vary throughout surgery.
In the VP study, to date, only one critical incident has been witnessed by the analyst. The
perfusionist has to calculate the width of circuit lines from patient details and a calculation
chart. Because this chart is difficult to use, the perfusionist inadvertently chose a venous line
which was too big for the patient. This was only discovered when the patient was on bypass and the venous line began returning a lot of air to the reservoir. The patient came to no harm, but had anything else gone wrong they could have been injured.

Table 1: Verbal Protocol; Examples of Action & Response Scenarios

Discussion
Training simulation is “the exercise of the operator on a mimic of the condition in which the
subject will perform their work.” (Stammers, 1983) This need not be a high fidelity
arrangement since effective training can be obtained using limited fidelity (Rolfe and Waag,
1982). It is envisaged that the initial simulation will involve a static representation of the
system to which the perfusionist is expected to respond. The scripts developed from the VP
and parameter data will direct the course of a simulation. The scripts will adopt the following
format (a sketch of one possible representation is given after the list).

• The perfusionist will be given contextual information (i.e. the type of surgery; the stage of
surgery and patient details).
• The static representation will include a set of physiological values which will be
updated over time; from these values the perfusionist must interpret the situation and outline their responses.
• The data from the verbal protocols will be compared with the perfusionists’ responses.
• Eventually, more challenging circumstances will be included where the information is
incomplete or there are unsafe physiological values. Such scripts will be derived from the
CIT and FMEA data.
• Finally, the activities of other members of the surgery team must be included as the
perfusionist does not operate the circuit in isolation. Team activities and communication
will therefore be studied and incorporated into the scripts.
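As a concrete illustration of the format listed above, a script might be represented roughly as follows; the field names and all clinical values are invented and not clinically validated.

# Minimal sketch (illustrative only): a scenario "script" holding the contextual
# information, the temporally updated physiological values, and the reference
# responses derived from the verbal-protocol database.
scenario_script = {
    "context": {
        "surgery_type": "coronary artery bypass graft",
        "stage": "established bypass",
        "patient": {"age": 62, "weight_kg": 80, "temperature_c": 34},
    },
    "timeline": [                      # minute of scenario -> displayed values
        {"t": 0, "values": {"venous_o2_sat": 75, "arterial_pressure_mmHg": 70}},
        {"t": 2, "values": {"venous_o2_sat": 62, "arterial_pressure_mmHg": 55}},
    ],
    "expected_responses": {            # reference answers for comparison
        2: ["check venous return", "increase pump flow"],
    },
}

def present(script, minute):
    """Return the context and the most recent set of values at a given minute."""
    visible = [entry for entry in script["timeline"] if entry["t"] <= minute]
    latest = visible[-1]["values"] if visible else {}
    return script["context"], latest

context, values = present(scenario_script, 2)
print(values)   # {'venous_o2_sat': 62, 'arterial_pressure_mmHg': 55}

The trainee's outlined responses at each update would then be compared with the expected-response entries derived from the verbal protocol database.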

It is impossible for the perfusionists to verbalise every thought and action involved. Thus,
much more VP and parameter data will have to be recorded to create a substantial database.
This data will eventually be combined with CIT and FMEA data to include a wide range of
failures in the scripts.

Conclusions
1. That certain physiological parameter values trigger specific responses by the perfusionist
and that the latter are governed by the context.
2. That perfusionists maintain an accurate awareness of the situation around them.
3. That VP is an excellent means of data collection in these conditions to give insight into
circuit operation.
4. That recording of physiological parameters every two minutes provides sufficient detail
for the simulation scripts.
5. That many more VP, parameter records, CITs and FMEAs will have to be undertaken to
create a substantial database for script development.
6. That the criteria required for candidates to sit the ECCP are inadequate. 100 clinical cases
will not prepare trainees for all eventualities or provide the opportunity to develop skills to
cope with critical situations.
7. That a simulator run by scenario scripts would be flexible, providing trainer control over
the experiences of the trainee.

References
Bogner, S.; 1994; Human Error In Medicine; Lawrence Erlbaum Associates Ltd.
Botney, R., Gaba, D.M. and Howard, S.K.; 1993; The role of fixation error in
preventing the detection and correction of a simulated volatile anaesthetic overdose;
Anaesthesiology; 79(3a).
Davis, M.; 1996; Personal Communication; Chief Perfusionist; Great Ormond Street
Hospital for Sick Children, London.
European Board of Cardiovascular Perfusion; 1997; Perfusion Announcement; Perfusion;
12(2); pp. 80.
Lindsay, J., Baber, C. and Carthey, J; 1997; Criticality analysis of potential failure in heart
and lung bypass systems during neonatal open heart surgery; The Principles of Risk
Assessment and Management for Programmable Electronic Medical Systems; 9th December,
1997, Strand Palace Hotel, London.
Rolfe, J.M. and Waag, W.L.; 1982; Flight simulators as flight devices: some continuing
psychological problems; Communications to the Congress of IAAP, Edinburgh; Roneo; 10
pp.
Sanger, K.; 1997; Personal Communication; Chief Perfusionist; The Royal Infirmary of
Edinburgh and the Royal Hospital for Sick Children in Edinburgh.
Stammers, R.B.; 1983; Simulators for Training in T.O.Kvalseth (Ed.), Ergonomics of
Workstation Design; London, Butterworths; pp. 229–242.
von Segesser, L.K.; 1997; Perfusion education and certification in Europe; Perfusion; 12;
pp. 243–246.
Weinger, M.B and Englund, C.E; 1990; Ergonomics and human factors affecting
anaesthetic vigilance and monitoring performance in the operating room environment;
Anaesthesiology, 73, pp. 95–102.
Wheeldon, D.R.; 1981; Can cardiopulmonary bypass be a safe procedure?; In D.
Longmore (Ed), Towards Safer Cardiac Surgery.
USE OF VERBAL PROTOCOL ANALYSIS IN THE
INVESTIGATION OF AN ORDER PICKING TASK

Brendan Ryan and Christine M.Haslegrave

Institute for Occupational Ergonomics


University of Nottingham, Nottingham NG7 2RD

Verbal protocol reports were collected from eleven subjects both during and
shortly after completion of familiar handling tasks in their usual workplace.
Additional reports were collected whilst the subjects watched a video
recording of their performance. Supplementary questions were used to
collect further information on factors which were not discussed by the
subjects. The reports give an indication of the range of information which
can be collected from subjects in relation to manual handling tasks, and
show that it is difficult to obtain detail with regard to posture, movements,
handling techniques and characteristics of the loads handled.

Introduction
Self report methods have been used extensively in practical situations to collect information
relating to accidents and injuries. They have been used with varying degrees of success in the
collection of data on exposure to risk factors associated with back injuries. In this study,
verbal protocol reports were collected to gain a greater understanding of how subjects
approach manual handling situations, and how they perceive, process and report information
relating to manual handling risk factors.

Method
Verbal protocol reports were collected from eleven subjects both during (concurrent protocol)
and shortly after (retrospective protocol) completion of familiar handling tasks in their usual
workplace, which was a distribution warehouse. The tasks involved the transfer of various
items from warehouse racking to a roll container. The largest item handled was a cardboard
box (approximate dimensions 0.5×0.5×0.5m, weight 5kg), which had to be pulled out from a
low level of the racking to pick items of stock which were stored inside.

Standardised instructions were read to the subjects, requesting them to think aloud during
the execution of the order picking task. Movements, posture and verbalisations by the subject
during the execution of the task were recorded using video tape. Retrospective protocols were
recorded on audio tape immediately on completion of the task.
Ten of the subjects returned to participate in day two of the study, forty-eight hours later.
A second retrospective report was collected in order to investigate the effect of a delay in
reporting. This was followed by a further report produced whilst watching the video
recording of the task completed on day one of the study.
Finally, supplementary questions were put to subjects.
All recordings were transcribed and tabulated to facilitate comparisons between subjects
and the various methods. Postures and movements during the performance of the task were
viewed on video tape by the experimenter. Written descriptions of these postures and
movements were tabulated alongside the verbal reports from the subjects.
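One possible layout for such a tabulation is sketched below; the column names and the example row are hypothetical and purely illustrative.

# Minimal sketch (hypothetical structure): tabulate each subject's reports from
# the different conditions alongside the experimenter's written description of
# the observed posture and movement, so the accounts can be compared row by row.
import csv
import io

rows = [
    {"subject": "S03", "task_segment": "pull box from low rack",
     "observed": "stooped, trunk flexed, two-handed pull",
     "concurrent": "getting the box out... it is stuck on the flap",
     "retrospective_day1": "picked the stock",
     "retrospective_day2": "picked the stock, put it in the roll cage",
     "with_video": "I usually crouch down for that one"},
]

fieldnames = ["subject", "task_segment", "observed", "concurrent",
              "retrospective_day1", "retrospective_day2", "with_video"]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())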

Results
Five of the subjects were only able to provide very general information, with a lack of task-related detail, or showed evidence of a poor understanding of the instructions. The reports of the other six
subjects were analysed in detail.

Concurrent report
The concurrent report aimed to collect details of thoughts while performing the task. It was
clear that the subjects’ main focus of attention was on the selection and locating of items to
make up the order on the pick list they had been given. Many subjects focused on racking
location numbers and order details which were read from the pick list. Several subjects paid
close attention to the products and attempted to identify and produce verbal labels for these.
However, attention wandered frequently, with some subjects making reference to hobbies,
ambitions or other non-work related activities. Gaps were evident in other reports where
subjects appeared to have found difficulty in producing verbalisations at the same time as
attending to their task. There were very few spontaneous reports of postures or techniques
being used to handle items.

Retrospective reports
The retrospective reports aimed to determine how well subjects could recall thoughts which
had entered their attention during performance of the task. These were repeated two days
later in order to evaluate the effect of a period of delay on the ability to report such thoughts.
Generally, the retrospective reports contained less descriptive detail than the concurrent
reports. The subjects tended to report general objectives or goals of the task (e.g. “picked the
stock”, “put them in the roll cage”). Concurrent and retrospective reports often contained
similar subject matter, but some subjects reordered their thoughts. In addition, new
information on the subjects’ thoughts on their performance of the task was often provided in
the retrospective reports. Thus, it is not possible to determine whether the reports contained
details of actual thoughts at the time of execution of the task, or whether they were based on
reconstructed information.

The second retrospective report showed further loss of information, particularly in relation to descriptive details of the task.

“Retrospective with video” report


After the second retrospective report, the subjects were again asked to describe their task and
the video recording was used as an additional prompt with regard to the activities carried out.
Details of recalled thoughts were again requested.
Some subjects provided information of a better quality than others, and reports were
produced in close synchronisation with the video images. Others referred to what they
“usually”, “often” or “sometimes” thought during the execution of the task.
These reports appeared to introduce some new information but would be subject to the
limitations relating to the reconstruction of thoughts, identified in the previous sections.

Supplementary questions
The final series of questions was included to collect information which subjects had not given
during their spontaneous reports. The questions addressed the awareness of postures,
movements and handling techniques, and the awareness of characteristics of the loads.
There was little evidence to suggest that conscious attention was devoted to postural risk
factors in this study. The subjects, almost without exception, admitted that they did not think
about the characteristics of the loads when they were of the weight and size used in this study.

Discussion
The reports give a good indication of the range of information which can be collected from
subjects both during and after completion of a typical handling task, and the type of detail
which they spontaneously report without the need to resort to questioning or prompting.
In general, the reports lacked descriptive detail and contained few or no references to
handling technique, posture or movements. Baril-Gingras and Lortie (1990) refer to handling
activities as a complex combination of different operations, which are rarely one continuous
movement. Subjects do not appear to think about and describe handling activities in this way,
and they fail to provide complete breakdowns of the phases involved in the tasks. The reports
suggest that subjects perceive handling activities in more “global terms”, enabling reports
which are goal orientated or state the general objective of the task. It is not clear whether
subjects have the ability to access information which will allow them to adequately describe
the handling activities. The final interviews failed to extract this information and comments
from some subjects appeared to confirm that they do not have access to such information in
tasks which they describe as “routine” or “automatic”. This might be expected, since Schmidt
(1982) explains the contribution of motor programs to posture and movement patterns. This
evidence of unconscious processing suggests that it would be difficult to use self report
methods for the purpose of collecting information on postures, movements and handling
techniques.
In a similar manner, subjects made very few references to characteristics of the load. Their
reports, and responses to the supplementary questions, suggest that the subjects made sub-
conscious decisions which may not fully account for the potential effect of size, weight or
other attributes of the items handled. One of the loads was of significant size and weight, and
was positioned in such a way that the opening flap of the box caused difficulties for the
subjects in removing this from the bottom level of the racking. Several subjects briefly
acknowledged this difficulty but made no other references to characteristics of the load, or to
the problematic postures which they adopted during this handling activity. The weights of
many of the items in the current study were admittedly small and the “size-weight” illusion
described by Wiktorin et al (1996) could be an important factor in the underestimation of the
loads. It is intended to repeat the study with a wider distribution of load sizes and weights in
order to see whether this raises the awareness of both the load characteristics and risk factors
associated with posture and movement patterns.
In the final part of the study, careful consideration was given to the question wording in an
attempt to minimise bias (Sheehy, 1981). Sheehy refers to a trade-off between attempts to obtain complete verbal reports and the erroneous information which may result from interrogation, and the responses to the questions have therefore been interpreted with caution.
The questions were located at the end of the experiment to minimise the effect of bias on the
other elements of the study.
When considering the methodology, the limitations of verbal protocol analysis are well
documented in the literature (Bainbridge and Sanderson, 1995; Nisbett and Wilson, 1977; Leplat and Hoc,
1981; Ericsson and Simon, 1993). A number of these methodological problems are of concern
in the current study. Several subjects referred to difficulties putting their thoughts into words
and it is possible that this is particularly a problem in verbalising handling techniques. Ericsson
and Simon (1993) discuss the problems which subjects may have in transforming non-orally
encoded information into speech, so it is perhaps not a surprising finding in this study that the
subjects focused on aspects of the task which may be easier to report.
The retrospective reports gave evidence of revision of accounts of the task, at least in
terms of ordering thoughts, giving the impression that the task had been approached in a more
logical fashion than the former report indicated. Sheehy (1981) warns of the potential for
reconstruction in self reported information, and explains how the subject’s conceptual
recognition of the events may be different from the chronological order of events. Additional
information was obtained from some of the retrospective reports, but it is not possible to
determine whether this was based on the subject’s actual thoughts, or originated from
reconstructed comments which the subject “thinks he must have thought”.
The “retrospective with video” report extracted new information in a number of cases.
These reports may also be susceptible to the effects of reconstruction but it is difficult to
either prove or disprove this.
The widely differing quality of reports from subjects while they were watching the video
recording appears to confirm Schmidt’s (1982) reservations about the abilities of subjects to
critically observe their own performance because of limited “viewing skills” to separate
relevant from irrelevant aspects of the action.

Conclusions
- The reports give an indication of the range of information which subjects process and are
able to describe during the execution of handling tasks and the types of information which are
accessible and inaccessible.

- The reports contain remarkably few references to posture, movements, handling techniques
or factors relating to the load. Subjects did not naturally provide detailed breakdowns of the
stages involved in handling activities. The failure to obtain such information could suggest:
(i) The subjects do not have access to this information
(ii) The ability to report on such factors is not well established. Subjects may have a greater
awareness of these details than may be apparent from the reports. Progress may be achieved
by efforts to improve the ability of subjects to provide reports.
In either case, the findings raise questions with regard to the validity of any subsequent
questions which might attempt to collect such information.

- The verbal protocol methodology has been shown to contain many limitations, such as the
difficulties putting thoughts into words and time limitations which prevent subjects from
mentioning all that passes through their minds. These may be particular problems in the
verbalisation of handling techniques.

- Further studies are now necessary to investigate the awareness of posture, movement and
handling techniques when handling tasks involve heavier and different types of loads.

References
Bainbridge, L. and Sanderson, P. 1995, Verbal protocol analysis. In J.R. Wilson and E.N. Corlett (eds.) Evaluation of Human Work: A practical methodology, Second Edition,
(Taylor and Francis, London), 169–201
Baril-Gingras, G. and Lortie, M. 1990, Analysis of the operative modes used to handle
containers other than boxes. In B.Das (ed) Advances in Industrial Ergonomics and
Safety II, (Taylor and Francis), 635–642
Ericsson, K.A. and Simon, H.A. 1993, Protocol Analysis: Verbal reports as data Revised
Edition, (The MIT Press, Cambridge, Massachusetts)
Leplat, J. and Hoc, J-M. 1981, Subsequent verbalisation in the study of cognitive processes,
Ergonomics, 24, 743–755
Nisbett, R.E. and Wilson T.D. 1977, Telling more than we can know: Verbal reports on
mental processes, Psychological Review, 84, 231–259
Sheehy, N.P. 1981, The interview in accident investigation. Methodological pitfalls,
Ergonomics, 24, 437–446
Schmidt, R.A. 1982, Motor Control and Learning. A Behavioural Emphasis, (Human
Kinetics Publishers, Champaign, Illinois)
Wiktorin, C., Selin, K., Ekenvall, L., Kilbom, A. and Alfredson, L. 1996, Evaluation of
perceived and self-reported forces exerted in occupational materials handling, Applied
Ergonomics, 27, 231–239
PARTICIPATORY ERGONOMICS

SELECTING AREAS FOR INTERVENTION

Benjamin L.Somberg

User-Centered Design Department


AT&T Labs
Room 2K-337
101 Crawfords Corner Road
Holmdel, NJ 07733 USA

When seeking to effect organizational change, the determination of which


aspects of the organization are to undergo change should be based on
objective data regarding existing problems in the environment. However it is
unprofitable to seek change that is not supported by the organization and
thus the personal preferences of the stakeholders must be given considerable
weight. A method for balancing the use of objective data and attention to
stakeholder preferences was used to help an organization select an
opportunity for change. This process ensured involvement of stakeholders by
assigning specific participatory roles and it forced stakeholders to agree in
advance on the criteria for selecting an area for change. This produced a
decision that was supported by all stakeholders and ensured that a verifiable
problem within the organization was being solved.

Introduction and Thesis


Attempts to effect organizational change often face the obstacle of a lack of agreement among
stakeholders on which aspects of the organization are the best candidates for change. Clearly
it is productive to address only those issues that represent verifiable problems with cost-
justifiable solutions. However, recommendations based on objective analyses may be
devalued because they are not consistent with management’s view of the existing
environment. Meanwhile, political considerations and anecdotal evidence often influence
stakeholders’ decisions about which of several opportunities for change should be seized.
To overcome the barrier concerning selection of an area for change, two conditions must
be met. First, a decision needs to be based as much as possible on objective data, and second,
there must be sensitivity to the possibly conflicting goals of all important stakeholders. A
recent project offers a successful example of meeting these two conditions through the use of
a “focus area selection process” that was data-driven, but had high stakeholder involvement.

Focus Area Selection Procedure


This project involved a division within the company that has a critical role in the maintenance
of the public telecommunications network.1 Because of the introduction of new technology in
the network, the role and demands of this organization were under significant transition,
providing an excellent opportunity to investigate ways to enhance the effectiveness and
efficiency of the operations within the division. The organization is a large one with a broad
scope and it was known from the beginning that there were more opportunities to effect
change than could be handled by the resources devoted to the effort. Some selection of
critical areas would have to occur. The types of change that would be considered included
enhancements to the work environment (including support tools), job and task design, and
organizational restructuring.
The project was conceived as a four-phase effort. The first phase was a high-level overall
analysis of the operations within the division, concentrating on identifying sources of
inefficiency, sources of error, and opportunities for change. At the conclusion of that analysis,
there was to be a selection of one or more “focus areas” which would receive detailed
attention. In the third phase, an analysis of those focus areas would be performed and a
recommended program for organizational change within the boundaries of those focus areas
was expected. Finally, the fourth phase of the project called for the development of an overall
plan for organizational change, incorporating the lessons learned from the selected focus
areas. In order to help ensure participation across the division, the following participatory
roles were identified:

• Analysis team: a small group of technical personnel with responsibility for performing the
analysis of the operations and for making specific recommendations regarding
organizational change.
• Core team: a group of key managers from the division who periodically reviewed the
project status and made final decisions regarding the overall project direction.
• Stakeholders: a larger group of managers or process owners within the division who
would likely be affected by the results of the project. They played a consulting role on the
project and participated in some key decisions.

It was recognized from the outset that the selection of the focus areas was going to be a
critical step in the project. A considerable effort was to be devoted to a detailed analysis of the
focus area and it was anticipated that significant, concrete recommendations for change
would result. If the focus areas were selected based upon an unbiased understanding of the
benefits and obstacles associated with each potential area, there would be confidence that the
remainder of the project would be devoted to solving verifiable problems and that significant
benefit to the organization could be achieved. However the project had numerous
stakeholders with competing interests and there was potential for territoriality to overwhelm
objectivity. Consequently, it was decided to attempt to reach a priori consensus among the stakeholders on a process for selecting the focus areas. Although the focus area selection process was to be the second phase of the project, defining that process occurred in parallel with the high-level analysis.

1. The goal of this paper is to describe a methodology, rather than to discuss the results of an analysis. Consequently, some of the proprietary details of the involved organizations have been stated in general terms or altered in ways that do not affect the goal of the paper.

Step One: Selection Criteria Nominations


The first step in selecting a focus area was for the stakeholders to construct a list of potential
criteria by which focus areas could be selected. Stakeholders were told generally how these
criteria would be used, but were not given any examples or restrictions on what they could
suggest. The stakeholders were asked to generate as many possible criteria as they wished.
Twelve unique selection criteria were offered, as shown in Table 1.

Table 1. Nominated selection criteria

Step Two: Voting on Selection Criteria


Once a set of possible selection criteria had been obtained, stakeholders were asked to help
choose a final set of criteria that would be used to select the focus areas. Stakeholders were
sent a list of the twelve nominated selection criteria and were asked to rank each nomination
as High (very important), Medium (moderately important), or Low (relatively unimportant),
with the constraint that there could be no more than four nominations placed into either of the
two higher categories. Based upon the voting, four criteria were adopted for use in selecting
the focus areas.
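A minimal sketch of how such votes might be aggregated is given below; the criteria names, ballots and the High/Medium/Low weighting are all invented, as the paper does not state the actual tallying rule.

# Minimal sketch (invented ballots): aggregate High/Medium/Low votes on the
# nominated criteria and adopt the four with the highest overall support.
WEIGHT = {"H": 2, "M": 1, "L": 0}    # assumed scoring, not taken from the paper

ballots = {                           # stakeholder -> {criterion: vote}
    "stakeholder_1": {"cost_of_change": "H", "impact_on_customers": "H",
                      "error_reduction": "M", "cycle_time": "L"},
    "stakeholder_2": {"cost_of_change": "M", "impact_on_customers": "H",
                      "error_reduction": "H", "cycle_time": "M"},
}

scores = {}
for votes in ballots.values():
    for criterion, vote in votes.items():
        scores[criterion] = scores.get(criterion, 0) + WEIGHT[vote]

adopted = sorted(scores, key=scores.get, reverse=True)[:4]
print(adopted)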

Step Three: Generation of Candidate Focus Areas


It is important to note that the nomination and voting on selection criteria were completed in
the absence of any knowledge on the part of the stakeholders about what focus areas were
being considered. This was an intentional effort to prevent stakeholders from using their votes
on selection criteria to promote the choice of a preferred focus area. Once the set of selection
criteria had been adopted, candidate focus areas could be disclosed. The focus areas were
generated from the results of the high-level analysis of the division’s operations and
represented areas in which the analysis revealed an opportunity to enhance the effectiveness
or efficiency of the organization. Stakeholders were allowed to supplement this list of
candidate focus areas, but no such suggestions were received. A few of the candidate focus
areas are summarized in Table 2.

Table 2. Sample focus areas

Step Four: Rating of Candidate Focus Areas


For each of the four previously-adopted selection criteria, a five-point rating scale was
adopted. Guided by the results of the high-level analysis, the analysis team rated each
candidate focus area on each of the four selection criteria. These ratings were reviewed by the
core team.

Step Five: Selection of Focus Area(s)


Given that a process for selecting focus areas had been agreed to in advance and that the
selection criteria had been selected by the stakeholders, it would have been possible at this
point to tally the ratings for the candidate focus areas and select the one(s) with the highest
overall rating. However it was decided that more would be gained through involving the
stakeholders once again in the selection process, particularly if the options could be presented
to the stakeholders in a manner that would encourage the use of the pre-established process.
By this time all stakeholders had received substantial information, in the form of interim
reports and presentations, about the major results of the high-level analysis. They had
available to them supporting information about each of the candidate focus areas, including
data to indicate the magnitude and effects of each problem area. Core team members were
sent packages that summarized what was known about the candidate focus areas. Specifically
the key findings of the analysis were reviewed along with a table that showed the relationship
between these findings and the candidate focus areas. Core team members were asked
individually to select one or more focus areas based upon this input. This was followed by a
meeting of the core team and the analysis team where their selections were reviewed and a
final determination of a focus area was made.
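Had a purely mechanical selection been acceptable, the tally could have been performed along the lines of the following sketch. The candidate names, the ratings and the equal weighting of the four criteria are illustrative assumptions rather than data from the study, which deliberately retained a human decision step.

# Illustrative tally of candidate focus areas against the four adopted
# selection criteria, using assumed 1-5 ratings and equal criterion weights.
ratings = {
    "inventory_handling":  {"criterion_1": 4, "criterion_2": 5,
                            "criterion_3": 3, "criterion_4": 4},
    "order_processing":    {"criterion_1": 3, "criterion_2": 2,
                            "criterion_3": 4, "criterion_4": 3},
    "shift_communication": {"criterion_1": 5, "criterion_2": 3,
                            "criterion_3": 2, "criterion_4": 2},
}

def overall(candidate_ratings):
    # Simple unweighted sum of the 1-5 ratings across the four criteria.
    return sum(candidate_ratings.values())

ranked = sorted(ratings.items(), key=lambda kv: overall(kv[1]), reverse=True)
for name, r in ranked:
    print(f"{name}: {overall(r)}")
# The highest-ranked candidate here would correspond to the area carried
# forward for more detailed analysis.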

Results
This process resulted in the selection of a focus area that met a pre-established set of selection
criteria.2 This focus area was also one that had been identified as a source of inefficiency by the
operations analysis and one for which opportunity for significant change had been indicated.
Although one of the core team members was less pleased with the outcome of the
selection process than the others,3 there was unanimous agreement that the process was fair
and objective. The division manager was particularly supportive of the process and often
reminded other core stakeholders that they had agreed to the process in advance. In the end,
all stakeholders supported the decision, and planning to perform the work involved with the selected focus area proceeded readily.

Conclusions
The goal of the process described here was to select an opportunity for enacting
organizational change that would solve a real problem and that would be supported by a
significant portion of the affected organization. Achieving this objective required a delicate
balance of sensitivity to the conflicting preferences of the stakeholders and reliance on
objective indicators of true organizational problems. The process contained three major
elements that helped reach this goal.
1. The process endeavored to achieve as much involvement by the stakeholders as
possible, as it is well-known that people who are involved in a process are more likely to be
supportive of the outcome of the process. This was accomplished by assigning roles with
specific responsibilities to members of the organization and by soliciting input across the
organization at each step in the process.
2. The process for selecting a focus area was defined and agreed to by the stakeholders in
advance of any discussion about content. It was assumed that once discussions about content
had begun, stakeholders would find it difficult to engage in process negotiations in an unbiased
manner. As a specific example of this principle, the stakeholders were asked to agree to criteria
that would be used to select a focus area before the candidate focus areas were revealed.
3. Even though care had been taken to achieve a priori consensus on a process for
selecting focus areas, it was assumed that people with strong interest in the outcome of the
process would find it difficult to apply the process objectively and consistently.
Consequently, results were packaged in a way that made it difficult for analysis results to be
ignored and encouraged conformance to the established process. In fact, the candidate focus
area that was ranked highest on the selection criteria was the one that was ultimately chosen
for more detailed analysis.

2 Although the process permitted the final selection to consist of more than one focus area, due primarily to resource limitations and the complexity of the highest-ranked candidate, only one focus area was selected.
3 As one might expect, this core team member had rated the selected focus area as a relatively low priority.
PARTICIPATORY ERGONOMICS IN THE
CONSTRUCTION INDUSTRY

A.M.de JongA, P.VinkB, W.F.SchaeferC

A Delft University of Technology, Faculty of Civil Engineering, Dept. of Building Technology and
Building processes, P.O. Box 5048, 2600 GA Delft, The Netherlands, E-mail
a.m.dejong@bouw.tudelft.nl, Facsimile +31 15 2784333
B NIA TNO, P.O. Box 75665, 1070 AR Amsterdam, The Netherlands
C Eindhoven University of Technology, Faculty of Architecture Building and Planning, Department of
Production and Construction, P.O. Box 513, 5600MB Eindhoven, The Netherlands

Work in the construction industry often results in high physical strain on the
worker, which may be reduced by introducing technological innovations at
construction sites. However, many innovations are not implemented at the
sites in the Netherlands. Participatory ergonomics is a method to improve
the implementation of innovations by involving the target group during the
development.
This paper reports on an evaluation study of two development processes
of innovations for construction sites. The first project, for painters, was initiated
to explore working methods that reduce physical strain and involved fewer
workers as participants than the second project. The second project, for
installation workers, was initiated by a company and aimed to develop tools
to support workers within existing working methods. Although the projects
differ in participatory approach, the goals of both projects were achieved.

Introduction
The origins of the method of participatory ergonomics have been defined by Noro and Imada
(1991) and involve employees as sources for problem solving and improving product quality
by stimulating their desire to improve organizational effectiveness and quality of work. Three
advantages of this method can be distinguished: (1) integration of human factors at different
organizational departments, (2) efficient use of sources of information and (3) using opinions
of employees to improve quality of work.
In a recent conference the necessity of input from workers was emphasized by different
authors (e.g. Jensen, 1997, Landau and Wakula, 1997, Moir, Buchholz and Garrett, 1997). In
this paper projects will be discussed which were carried out with the method of participatory
ergonomics. In a previous conference paper (De Jong, 1997) the method of participatory
ergonomics of NIA TNO, the Dutch organization for labour issues, has been analyzed. Only a
brief overview of the method will be given here. The projects follow the six steps of the
method as shown in Table 1. The analysis indicated that for each step of the process different participatory approaches and groups must be chosen. Also, in some steps participation is necessary to a greater extent than in other steps.

Table 1. Step by step approach (Vink et al., 1995)

This paper aims to analyze the goals of the two projects in relation to the participatory approach. In 1997 two projects were carried out in the Netherlands with the method, by different project managers. One was carried out in a large installation company and the other in cooperation with two middle-sized painting companies. The projects cannot be compared with each other on the outcome of the process, since they were performed under different circumstances. However, the different goals of the projects can be compared in relation to the participatory approach and the methods and techniques that were used in the process.

Project 1: painters
The project was at first aimed at developing devices to support the painter during standing
activities. However, in the participatory process the problems concerning working conditions
turned out to be more diverse. Therefore, the scope was broadened and solutions were proposed for a number of important problems of the work, such as reaching and repetitive movements. The process will be described using the step by step approach shown in
Table 1. In each step the participatory approach will be outlined.

Step 1
The project was funded by the painters' trade organization and was carried out by NIA TNO and two painting companies. However, the companies were not involved in the initiation of the project, but only from the beginning of step 2. Four experts of NIA TNO suggested the participatory approach and handled the organization and communication of the project. The project proposal was written by this group and sent to the trade organization for financial support. After approval, the two companies were asked to join the project group.

Step 2
Video recordings were made at locations of the two painting companies. These showed that painters do not only paint; they also have to build scaffolding, transport materials and equipment, and prepare the surfaces for painting. The problems did not only occur during painting but also, and even more so, during the other activities. The painters were asked informally about their body discomfort and the activities in which it occurred.

Step 3
A solution session was prepared and organized by students as a special course. Two executives of one company and several experts were present to explain the activities of the painters. A special technique to improve creativity during such a session was used by a student chairman to get as many ideas and solutions as possible. This generated many ideas for solving different problems, such as the repetitive movements, the reaching and the painting job itself. The ideas were not worked out further than drawings and text.

Step 4
The ideas were first categorized by type of problem and prioritized by the three project members of NIA TNO. Fifteen solutions remained, which were drawn and put in a booklet with grading forms. The solutions concerned alternatives for scaffolding, alternatives for painting and preparing activities, and alternatives for supporting devices. These booklets were sent by mail to one company and handed over with an explanation to the other. The mailed booklets were given much lower grades than the booklets that were handed over. However, both companies graded the solutions that would eliminate the painting job very low, since they regarded painting as their principal job.
Steps 5 and 6 have not yet been carried out; the financial support only extended to the first four steps of the project. Financial support for the remainder of the process will be requested from the trade organization next year, and both companies have indicated that they would like to cooperate.
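By way of illustration, the grades returned in step 4 could be aggregated along the lines of the sketch below. The solution names, the grades and the 1-10 scale are assumed for demonstration and do not reproduce the actual booklet data.

from statistics import mean

# Hypothetical grades from the two painting companies: one received the
# booklet by mail, the other received it with a verbal explanation.
grades = {
    "company_mailed":  {"rolling_scaffold": 6, "long_handled_tool": 5,
                        "eliminate_painting": 2},
    "company_visited": {"rolling_scaffold": 8, "long_handled_tool": 7,
                        "eliminate_painting": 3},
}

for solution in grades["company_mailed"]:
    per_company = [grades[company][solution] for company in grades]
    print(f"{solution}: mean {mean(per_company):.1f}, per company {per_company}")
# A systematic gap between the two columns would reproduce the pattern of
# lower grades for the mailed booklets reported above.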

Project 2: installation workers


This development project is part of a larger work improvement process. This part of the project was aimed at developing devices to reduce physical strain for three important problems: static load, kneeling and manual transport. The process will be described as in the previous section.

Step 1
This project was a follow-up of a previous project aimed at improving company culture with respect to working conditions. The installation company gave NIA TNO the assignment to start a development project for mechanical devices to reduce physical strain for three problems: manual transport, kneeling and static load. The company had already installed a special committee for working conditions, which was joined by the project manager of NIA TNO. In every step several experts were asked to contribute to the project.

Step 2
The physical strain for the problems mentioned previously was analyzed using questionnaires administered by safety and health coordinators. These questionnaires asked the workers to point out problem activities and to indicate how often they were performed. This was a good method for obtaining more information about the circumstances in which problems occurred. The data were presented graphically to make them comprehensible.

Step 3
Again, a solution session was organized, but this time by NIA TNO in a round table meeting
with workers, safety and health coordinators, chiefs and experts. An expert chaired the session
and started with some graphics of physical strain to show the importance of the problems. Then the participants were asked to explain in which situations the problems occurred and what their causes were. A better definition of the problem leads to better solutions. At first, solutions were drawn or written down individually; later the results were discussed in the group. A number of criteria were used in the discussion to prioritize the solutions. The group made a final judgement based on the necessity of each solution. Among other things, the selected solutions concerned transport devices, fixing equipment and supporting devices.

Step 4
Afterwards the committee evaluated the solutions in terms of their effects. Some existing solutions that could also be used elsewhere were found in one department; these solutions have to be diffused into other departments of the company. Other solutions had to be either bought or developed in cooperation with production companies and NIA TNO.

Step 5
An information day, organized for all safety and health experts of the company, was intended to make them aware of the existence and availability of the solutions. Information was given on the safety and health regulations, physical strain and the policy of the company. Most important, however, was the presence of the prototypes of the solutions. Furthermore, a booklet containing all the solutions and the production companies that sell them was handed out.

The next step is for the safety and health experts to buy solutions and introduce them in the departments. The project is now in this phase, and it will be evaluated by the author.

Differences in participation
In this section some important differences in participation are briefly reviewed.
• aimed at improving existing situations or creating new working methods
Since project 2 had to produce working results in the short term, the working methods as a concept were not evaluated. The working methods of the painters in project 1, however, were discussed and evaluated. This resulted in very radical solutions, such as the elimination of the painting job.
• determining problems with or without workers
Step 2 was carried out in project 1 without questionnaires to the painters, whereas project 2 did involve questionnaires. In project 1, therefore, the focus may have been put on problems of the painters that they themselves would not have prioritized.
• developing solutions with or without workers and companies
Step 3 involved no painters and just a small number of installation workers. Only one painting company contributed to the solution session. The installation company gave several people the chance to contribute to the solution session, which resulted in interesting discussions between the shop floor and management.
• feedback to workers
Project 1 gave more feedback to all workers in the company than project 2. The painters were informed of the solutions by grading them in step 4. Project 2 only involved a small number of workers and did not give feedback to the other workers until step 5.

Discussion
The two projects, which both used the participatory approach, had a very different organization and used very different methods and techniques. The context of a project often determines the type of approach that is chosen. If a company initiates the project, it is involved in the process to a great extent. On the other hand, if the trade organization initiates it, the developing company is at the centre and more companies are involved, which means that they do not 'own' the solutions. Project 1 showed an interesting difference in the grades given by the two companies. In the first company no verbal explanation of the solutions and the grading method was given, which affected the grades compared with the second company. This could indicate that the techniques used to present and select solutions influence the outcome.
The use of participatory groups also differed greatly. In project 1 mainly students were involved in the solution session, whereas in project 2 only the company was involved. This may be the cause of the difference in the solutions: companies cannot forget about daily working methods as easily as outsiders and therefore generate less radical solutions.

Conclusion
The projects discussed in this paper had very different approaches and used very different techniques. Both projects produced the requested outcome. Project 1 was more radical and changed working methods. Project 2 developed solutions to reduce physical workload within existing working methods. The participatory approach of project 1 was aimed at defining the theoretical possibilities for reducing workload, whereas the approach of project 2 was aimed at finding, in the short term, practical solutions that actually worked. Project 1 therefore involved fewer workers and less management than project 2, because more involvement was simply not necessary to obtain good results. The methods and techniques were also adjusted specifically to reach the goal of each project.
Further research will analyze these and other projects with respect to implementation and adoption.

References
Jensen, P.L. 1997. The Scandinavian approach in participatory ergonomics. In: Proceedings
IEA ‘97, Vol. 1, Seppälä et al. (eds.), Tampere, Finland, p. 13–15.
Jong, A.M.de, Vink, P. and Schaefer, W.F. 1997. An evaluation study of the participation in
the development process of innovations for construction sites, In: Proceedings of the
1st South African Conference on Safety & Health on Construction Sites, T.C.
Haupt and P.D.Rwelamila (eds.), CIB W99, Cape Town, 4–9 October 1997, p. 47–56.
Landau, K. and Wakula, J. 1997. Ergonomic design of tools and working objects in the
construction industry. In: Proceedings IEA ‘97, Vol. 6, Seppälä et al. (eds.), Tampere,
Finland, p. 139–142.
Moir, S., Buchholz, B., Garrett, J. 1997. Health Trak: a participatory model for intervention
in construction. In: Proceedings IEA ‘97, Vol. 6, Seppälä et al. (eds.), Tampere,
Finland, p. 151–154.
Noro, K. and Imada, A. 1991. Participatory Ergonomics. Taylor & Francis, London.
Vink, P., Peeters, M., Gründemann, R.W.M., Smulders, P.G.W., Kompier, M.A.J. and Dul,
J. 1995. A participatory ergonomics approach to reduce mental and physical
workload, Int. J. of Industrial Ergonomics, 15, 389–396.
USER TRIAL OF A MANUAL HANDLING PROBLEM AND
ITS “SOLUTION”

D.Klein, W.S.Green and H.Kanis

Department of Product and System Ergonomics


School of Industrial Design Engineering
Delft University of Technology
Jaffalaan 9, 2628 BX Delft, the Netherlands
e-mail: w.s.green@io.tudelft.nl

A study (Klein, 1997) has been conducted at a major beer company with the
aim of solving some of the manual handling problems of the distribution of
50 L beer kegs to pubs. It started by conducting a user trial of the so-called
Keg Buggy, a device developed by the company to reduce the workload.
During the user trial, it turned out that the whole foundation for the
development of the Keg Buggy was missing: the actual problems had never
been analysed properly. This paper shows how strategic design decisions
were made from behind the desk in an early stage, only to find out much
later that the whole product idea was on the wrong track. Secondly, this
project illustrates how a user trial can be used to establish the occurrence of
manual handling problems and especially to determine the reasons for their
occurrence.

Introduction
Distribution system of kegs
The main distribution units of beer for pubs are the steel 50 L beer kegs. They are transported
to pubs in lorries, together with other goods such as liquors, wines and soft drinks. The lorry
crews have all kinds of small equipment to help them get the goods from lorries to pubs. The
goods sometimes have to go over thresholds, through narrow passages, upstairs to pubs on the
first floor or downwards into cellars.
The current Dutch guidelines concerning manual handling do not allow lifting more than 23 kg or pulling with more than 200 newtons. When postures are poor these limits are even lower (Kluver, 1992). Because of the weight of a full beer keg (66.5 kg), it was expected that these guidelines were often overstepped.
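A minimal sketch of how handling tasks can be screened against these guideline limits is given below; the task list and the force estimates are illustrative assumptions, not measurements from the trial.

# Screening hypothetical keg-handling tasks against the guideline limits
# cited above (23 kg lifting, 200 N pulling). Task names and force values
# are assumptions for illustration only.
LIFT_LIMIT_KG = 23.0
PULL_LIMIT_N = 200.0
KEG_MASS_KG = 66.5

tasks = [
    {"name": "lift full keg from pallet", "lift_kg": KEG_MASS_KG, "pull_N": 0},
    {"name": "pull loaded hand truck over threshold", "lift_kg": 0, "pull_N": 250},
    {"name": "roll keg through cellar hatch onto cushion", "lift_kg": 0, "pull_N": 50},
]

for task in tasks:
    exceeds = (task["lift_kg"] > LIFT_LIMIT_KG) or (task["pull_N"] > PULL_LIMIT_N)
    status = "exceeds guideline" if exceeds else "within guideline"
    print(f'{task["name"]}: {status}')
# Lifting a full 66.5 kg keg is nearly three times the 23 kg limit, which is
# why the guidelines were expected to be overstepped.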

Earlier research of the manual handling problems


As early as 1985 the beer company commissioned an analysis of the manual handling problems during
the distribution of beer kegs (Snijders, 1985). The resolution of that research was low: it only
measured the amount of bending needed while distributing the beer kegs. It showed that a lot
of bending occurred, but not why or when.
In 1992 a second project was conducted, analysing which handling methods were causing
problems (Kluver, 1992). It still didn’t produce answers about why the workers used the
heavy and demanding handling methods.

Developing a product to reduce the workload


In 1995, a student in mechanical engineering proposed to redesign a product used for
carrying bricks so that it would solve the problems with transporting kegs. The management,
eager to get results, accepted the proposal.
The student did exactly what was agreed upon: develop a vehicle to handle beer kegs. It was
meant to be suitable for taking kegs from pallets or from the top racks of wheeling containers,
for transporting kegs from lorry to pub, and, especially, for lowering kegs into cellars.

Figure 1: The original product for carrying bricks (a), the first prototype (b) and the
Keg Buggy (c) as tested at the actual delivery situations by the lorry crews. The final
product has an electric winch, a battery and a special hook for grabbing kegs by the
handles or from the side.

The design was prototyped and analysed in “laboratory conditions”: the actual lorry crews
were not involved, nor was the product tested in the actual surroundings in and around pubs.
From this analysis it was concluded that the design of the buggy should be optimised before it
was ready for testing in the actual surroundings. The construction firm that developed a new
prototype still had no analysis of the situations at pubs or the current handling methods. In
1997 the product was finally developed sufficiently for the company to consider it suitable for
testing in a user trial.

User trial of the Keg Buggy and current methods


Lay-out of user trial
The user trial was conducted with the actual lorry crews. They were asked to first show how
they would normally handle the kegs at each delivery situation, before testing the Keg Buggy. In
this way the nature and size of the manual handling problem that was actually solved could be
estimated. The user trials took place across the Netherlands, to even out the inter-regional
differences between building styles of pubs, delivery demands and styles of handling. The buggy
was tested and compared with the normal handling methods on comprehensibility, effectiveness
in reducing the workload, time efficiency and safety. The methods used were video recording
and analysis, interviews, questionnaires and personal experience of the work situation by the
designer.

The Keg Buggy as an all-round keg-handling device


It was soon evident that the buggy was completely useless as an all-round vehicle for
handling kegs. For instance, for transporting kegs over the ground, the buggy wasn’t
competitive with the currently used hand trucks that are smaller, lighter, very tough and can
carry two kegs at a time.
The buggy was also meant to solve the problems with taking kegs from the top racks of
wheeling containers. Normally a worker would pull a keg out by hand and use too much force
while guiding it to the floor. However, the worker already has the alternative of using a drop
cushion to break the fall. That method is much more efficient and practical than any tackling
device can ever be, but still the workers never use it. The reason is that the workers don’t
consider lowering a keg to be a problem and also find getting and placing a drop cushion too
much of a hassle. As the relatively small and light drop cushion is already too much of a
bother to use, the chances of the heavy and slow Keg Buggy ever being used are zero.
The choice was soon made to focus on the most specific and typical function of the Keg
Buggy: lowering kegs into cellars.

Figure 2: Some of the functions of the Keg Buggy. a) Taking a keg from the top rack of
a wheeling container. b) Lowering a vertical keg straight into a cellar over a slide. c)
Lowering a horizontal keg into a cellar in two phases.

Lowering kegs into cellars the normal way


The workers first showed how they would normally lower kegs into each cellar before trying
out the Keg Buggy. Surprisingly, it turned out that in most cellar situations there was no
manual handling problem at all, or only a small one. In the easiest cellar situations neither
worker needs to lift the keg at all: the top man can simply roll the keg through the hatch and
let it fall down onto a drop cushion, or use other methods that are no problem according to the
guidelines concerning manual handling. At other times, the workers would need to lift or
swing the keg from the ground for part of a second, for instance to avoid the falling keg
damaging some pipelines directly under the hatch, to place the keg onto a slide, or to make
sure that a drop cushion is hit in the centre.

Figure 3: a) Example of a very easy cellar situation: simply roll the keg in without any
lifting. b) Because the top man has to carefully aim the keg on the centre of a drop
cushion, he needs to swing it away from the edge. c) Walking down with a keg step by
step. d) An awkward combination of obstacles under the hatch and a deep cellar make
considerable force and bad posture unavoidable while lowering a keg.

On only a few occasions are the cellar situations so awkward that they cause considerable
manual handling problems. At one of the situations encountered there was a stairway with
vulnerable marble steps, so each keg had to be carried down. A slide over the steps, as seen at
other delivery situations, would have solved this problem and been more efficient at the same
time. The Keg Buggy, however, offered no solution at all. Other serious manual handling problems
occurred because of awkward combinations of small hatches and obstacles under the hatch.
The conclusion of this analysis was that lowering of kegs into cellars itself doesn’t cause
the manual handling problems, but awkward cellar entrances do. All the methods currently
used for lowering kegs are very efficient. It takes about 20 seconds cycle time to lower a keg
step by step over a stairway and situations with slides and drop cushions can be twice as fast.
Speed is very important to the company, but also to the workers: they prefer to work fast and
save time to drink coffee, to talk or to be back home earlier.

Using the Keg Buggy for lowering kegs


Besides the fact that lowering kegs into cellars isn’t as hard as presumed, the Keg Buggy is
unacceptable as an alternative. Even if all the design mistakes of the prototype were solved, the
whole concept of lowering kegs with a heavy motorised tackling device is much too slow: lowering
the hook system, attaching it to the keg, taking the buggy to the hatch, positioning it, lowering
the keg to the bottom, unhooking the hook system, hoisting up the empty hook and taking the
buggy away for the next keg took at least a minute at the trials, even if nothing went wrong. This
is completely unacceptable to the workers, who are used to doing the same work in a few seconds.
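A rough calculation illustrates the scale of the difference, using the cycle times reported above; the 15-keg delivery size is an assumption chosen purely for illustration.

# Rough comparison of cellar-delivery times: about 20 s per keg with the
# usual methods, at least 60 s per keg with the buggy (best case observed).
KEGS_PER_DELIVERY = 15  # assumed delivery size
normal_s = KEGS_PER_DELIVERY * 20
buggy_s = KEGS_PER_DELIVERY * 60

print(f"usual methods: {normal_s / 60:.0f} min, Keg Buggy: {buggy_s / 60:.0f} min")
# Even in the best case the buggy roughly triples the lowering time, which
# helps explain why the crews rejected it.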

Discarding the concept of the Keg Buggy


The management board, which had issued the development assignment over two years earlier and had been debating what to do about the manual handling problems for 12 years, was shown
the results of both the analysis of the current distribution methods and the user trial with the
Keg Buggy. They were quickly convinced to discard the whole concept of lowering kegs into
cellars with something like a Keg Buggy.

Continuation of the project


This paper deals with the first stage of a significant trial, design and development process, the
later stages of which will be reported in detail separately.
A thorough analysis of the manual handling problems and, more importantly, the reasons
for their occurrence, was made to determine which types of keg handling overstepped the
guidelines and why. This is the link which had been missing in earlier studies. Now ideas
could be generated which would solve the real problems.
Several ideas for potential solutions were generated and connected with the various
manual handling problems. Their feasibility was estimated, for instance by conducting
simple user trials at an early stage. It was concluded that the distribution of 50 L kegs could
be made to conform completely to the guidelines, provided that the beer company would in
future have some minimal requirements for delivery situations to rule out the most awkward
and infrequently occurring problems.
The most promising problem-solution combination was chosen to be developed first. The
information regarding the delivery situations and also the mentality and attitude of the
workers, as gathered during the user trials, was integrated into the design process.

Discussion
As demonstrated, the choices that are made in an early development stage can carry a project
in the wrong direction for years. In this case, and with the very best of motivation, it was decided what type of solution was going to improve the working conditions before gaining the
necessary insight into the real, as opposed to perceived, problems. By conducting a user trial
of the distribution methods at an early stage, the Keg Buggy could have been eliminated as a
solution, better ideas generated and considerable time and money saved.
User trials are often seen only as tools to seek out and eliminate design mistakes in
prototypes or even end products (Roozenburg and Eekels, 1991), but trials can also be used to
evaluate functions, before any product is developed. User trials can be a source of inspiration
for the generation of new product ideas (Kanis and Green, 1996) in addition to helping
evaluate their feasibility.

References
Kanis, H., Green, W.S. 1996, Deel III; gebruik, cognitie en veiligheid, (Technische
Universiteit Delft, Delft)
Klein, D. 1997, The manual handling of kegs. Graduation report. (Technische Universiteit
Delft, Delft)
Kluver, B.D.R., Riel, M.P.J.M. van, Snijders, C.J. 1992, Toetsing van de fysieke belasting bij
de distributie van bierfusten aan de richtlijnen en normen van de EG aan de Arbowet,
(Erasmus Universiteit Rotterdam, Rotterdam)
Roozenburg, N.F.M. and Eekels, J. 1991, Produktontwerpen: Structuur en methoden,
(Technische Universiteit Delft, Delft)
Snijders, C.J. 1985, Analyse van de fysieke belasting tijdens de distributie van fusten,
(Erasmus Universiteit Rotterdam, Rotterdam)
INDUSTRIAL
APPLICATIONS
Case Study: A Human Factors Safety Assessment Of A
Heavy Lift Operation

W Ian Hamilton 1 & Phil Charles2

1 Human Engineering Limited, Shore House, 68 Westbury Hill, Westbury-on-Trym, Bristol, BS9 3AA

2 Amec Process & Energy, Unit 2 Altec Centre, Minto Drive, Altens Industrial Estate, Aberdeen, AB12 3LW

This paper presents the results of a human factors analysis of an offshore heavy lift operation. The lift team comprised 18 roles, which were all analysed using HTA and timeline analysis techniques. The data were then subjected to human HAZOP and communications analysis procedures, which
revealed certain operational vulnerabilities. A full set of control measures
was specified to manage these risks. The operation was then performed
successfully. The work serves to illustrate the value which timely
intervention of human factors assessment can bring to a major engineering
project.

Introduction
Part of the development of Marathon Oil (UK)'s Brae B platform to accommodate the Kingfisher field required the installation of new separator and pipework modules. Although these modules weighed 220 tonnes in total, it was determined that they could be transported
to the field on a normal supply vessel and hoisted onto the platform using the drilling draw
works and a cantilevered lifting frame. This was a radical alternative to the traditional method
of using a very expensive heavy lift barge.
This operation would involve a large number of personnel on-board the platform, the
supply vessel, and the standby vessel. From the outset it was recognised that effective
command and control would be critical to the safety of the operation. Furthermore, this was
an activity which was outwith the normal experience of all the participants, as it was the first
time that this technique had been used in the North Sea. To address this, the prime contractor
commissioned a human factors analysis of the lift operation to highlight any potential human
operability hazards and to identify appropriate risk control measures.

Task Analysis

Team Structure
A hierarchical task analysis of the lift operation was performed down to the function level
using the ATLAS tool (Human Engineering, 1996), based on a review of the lift plan
documentation which had been prepared by the customer. This served to identify the
organisation of roles and responsibilities within the operation.
Essentially, the lift team comprised 18 roles as illustrated in Figure 1. In the interests of
brevity only the lead roles (identified by the shaded boxes) are described here. These
descriptions are presented in Table 1.

Figure 1. The organisation of the lift team


Table 1. Lead roles within the lift team

Site Visit
The analyst visited the offshore installation to perform a site inspection and to interview key
personnel who would be involved in the lift operation. As a result of this data collection
activity, the analysis data were verified and extended. Following the revision a full timeline
model of the lift operation was developed. This not only captured the order of activities for
each member of the lift team, but also revealed the dependencies between activities.
This defined the lift operation as comprising eleven stages. At this point the analysis was
also taken down another level of detail. This level defined the co-ordination activities,
responsibilities, equipment used, and training needs for each functional activity. From this it
was possible to map out the transfer of responsibilities between lead roles through each stage
of the operation.
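The sketch below illustrates, in schematic form, how such timeline data can be represented so that the dependencies between activities are explicit; the activity names and role assignments are invented for illustration, and the actual model was held in the ATLAS tool.

# Hypothetical representation of timeline activities and their dependencies.
# Activity names and role assignments are assumptions; the real analysis
# produced a far larger model covering eleven stages and 18 roles.
activities = {
    "rig_load_on_vessel": {"role": "Rigging supervisor", "depends_on": []},
    "position_vessel":    {"role": "Captain",            "depends_on": ["rig_load_on_vessel"]},
    "attach_lift_ropes":  {"role": "Boat rigging crew",  "depends_on": ["position_vessel"]},
    "hoist_module":       {"role": "Driller",            "depends_on": ["attach_lift_ropes"]},
}

def ordered(acts):
    # Return the activities in an order that respects every dependency.
    done, order = set(), []
    while len(order) < len(acts):
        for name, a in acts.items():
            if name not in done and all(d in done for d in a["depends_on"]):
                done.add(name)
                order.append(name)
    return order

for name in ordered(activities):
    print(f'{name} ({activities[name]["role"]})')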

Human HAZOP Analysis


The operational sequence of activities, represented in the timelines, was subjected to a
human hazard and operability (HAZOP) analysis. This is a formal methodology which is
similar to the standard engineering HAZOP procedure (Kirwan, 1994, pp 95–99), but which
makes use of guide words which are more appropriate for the identification of human errors.
In this case, the guide words were organised in the form of a systems ergonomics checklist.
This checklist was applied to each activity and the results recorded in a set of standard data
fields within ATLAS.
The critical information defined for each activity included the following: Operator
(critical role), Command/Initiation, Action, Equipment used, Response, Error/Hazard, Type
of error, Consequence, etc. This process revealed the major hazards and their consequences
associated with each activity. It also classified the nature and cause of the hazard.

Communications Analysis
As expected, most of the hazards arose through potential co-ordination errors. As a result, a
full communications analysis was also performed. This included an examination of all of the
communication facilities available on the Brae B. In addition, the analysis developed a
command and co-ordination sequence which represented the commands which either initiate
or terminate the co-ordinated actions.

Principal Hazards & Risks


The HAZOP analysis revealed a number of human operability hazards and vulnerabilities.
Some of these are summarised in Table 2.

Figure 2. A photograph of the lift operation in progress



Table 2. Principal human operability hazards and risks

Control Measures
The following section describes some of the risk control strategies which were adopted to
minimise the risks which had been identified.

Briefing Packs
To combat the concerns over team competence the training needs for all participants were
specified in detail. Individual briefing packs were prepared for every member of the lift team.
These complemented the verbal briefings and hands-on training which the team members
received. Each contained a full description of the operation, roles and responsibilities,
communication and safety rules, and a personal checklist of actions. The safety rules also
ensured proper response to platform alarms.
The risk of complacency at the second lift was controlled by having a comprehensive
debrief following the first lift. The second lift then occurred after a suitable rest break. Also,
checklists were prepared for each lift and signed off by the required individuals prior to each
lift being authorised to proceed.

Sterile Area
To control the risk to spectators, the lift operation was performed within a sterile area into
which access was restricted to essential personnel, and only then under the authority of the
lift co-ordinator. This rule was enforced emphatically in the vicinity of the moving loads.

Ambiguous Communication
Where the communications analysis revealed gaps in the co-ordination sequence, appropriate
remedial measures were recommended. Also a comprehensive communication protocol was
specified to ensure the use of only reserved and unambiguous language. Similarly, where the
analysis revealed a special vulnerability to communications failure, back-up non-verbal
methods were recommended.
Co-ordination was further enhanced by the introduction of a GO/HALT procedural check
following every stage of the operation.

Positioning Of The Boat


The Captain was aided in the accurate positioning of his vessel by the use of a visual marker system for N-S positioning, and by instructions relayed from an observer located on the lower deck of the platform, who was able to check W-E positioning.

Handling Of Rigging
Strict procedures were developed for the management of the rigging gear on the lift modules. The
same rigging was used onshore and offshore, and the boat based rigging crew attended the onshore
loadout to ensure that they were intimately familiar with the gear and procedures. In addition, the
rigging gear was colour coded to ensure that it was only attached at its designated points.

Failure To Stop Draw Works


To ensure that the Driller would know when to stop the draw works, even in the event of a
radio failure, the lift ropes were marked with coloured bands to designate the various stop
points for each stage of the lift.

Conclusions
The lift operation took place successfully and without incident in June 1997. The operation is
depicted in Figure 2. This case study serves to illustrate the value which can be added to the
planning of a major engineering operation by the timely application of human factors analysis
techniques. In addition, the work illustrates how a wide range of outputs and specifications
can be derived from the analysis data; thus demonstrating that such intervention can also be
highly cost effective.

References
Human Engineering Limited, 1996, ATLAS—A Practical Task Analysis System. (Software
Created For The Apple Macintosh, Version 1.1K).
Kirwan, B. 1994, A Guide to Practical Human Reliability Assessment (Taylor & Francis,
London)
THE APPLICATION OF ERGONOMICS TO VOLUME HIGH
QUALITY SHEET PRINTING AND FINISHING

Mic L.Porter

University of Northumbria at Newcastle


Newcastle-upon-Tyne. NE1 8ST
(0191)227 3155
mic.porter@unn.ac.uk

Four projects undertaken in commercial printing companies have highlighted many areas of concern to an ergonomist. The companies
involved were all “sheet printers” producing high quality, specialist, cut
items that were then shrink wrapped and despatched to the customer. The
high speed presses used were capable of sophisticated and ultra high quality
printing but could not be set up, maintained or cleaned in an ergonomically
acceptable manner. The printed work was then inspected, “jogged” and
guillotined. The cut sets, perhaps held with a rubber band, are then “fan”
inspected before wrapping and dispatch. In all of these “finishing” tasks
poor ergonomics was identified, remedial actions found and implementation
started. Although desirable, major modification to the presses was not possible; nevertheless, the application of ergonomics was found to be of justifiable benefit to the organisations.

Introduction
In 1491, possibly the year of his death, William Caxton published “Journals of Health”. In the
subsequent 500 years of sheet printing the fundamental intention of applying ink to paper in
precisely defined locations so that the highest quality reproduction occurs has changed little.
Indeed, small "Private Presses" still exist using equipment very much like that of Caxton, albeit with metal having largely substituted for wood. This is not, however, the case for the plants in which these ergonomic audits were undertaken. Single colour, single side printing at rates of one or two sheets per minute has now become 220 (or more) sheets per minute, printed in several colours and possibly on both sides simultaneously. However, the printer
and all those that support them still focus on the quality of the image produced to the extent
that virtually everything is subservient to the interlinked twin goals of quality and speed. This
is particularly true for the presses themselves where the centre of attention is the paper path
and not operator ergonomics.
Each of the main stages of the process will now be discussed; however, in some cases precise details cannot be given because of commercial and other confidentiality constraints.

Observations from four Printing Plants


Printing
The presses generally operated at between 6000 and 10000 sheets per hour, although greater rates were possible. The largest sheet size that could be handled was 720mm×1020mm, while 970mm×700mm and 810mm×650mm were more typical. The "weight" of the papers ranged from under 80g per square metre (gsm) to over 110gsm. Thus when the press is running it can require between 250kg and 810kg of paper per hour and, due to the extra weight of the applied ink and varnish, 260kg to 845kg to be removed. The wooden pallets upon which the paper was supplied (in 25 ream, 12500 sheet loads) could vary from 5.38kg to 7.32kg (mean of 6.19kg for 5) in one plant to a maximum of 12.13kg in another. The chipboard protection for the top of the paper (c1550mm high) typically weighed between 1.6kg and 1.8kg. The wooden pallets and protection were, when not in use, manually handled and often stacked above shoulder height in order to minimise the floor area required and the congestion caused near to the presses. In one of the plants "continuous running bars" were used to support the paper while another stack was loaded. These steel rods weighed about 1kg each and would need to be inserted and removed from among the paper stack at about shoulder height.
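These throughput figures can be cross-checked from the sheet sizes, grammages and press speeds quoted above; the particular pairings of size, grammage and speed in the sketch below are assumptions chosen to bracket the range.

# Cross-check of the paper throughput figures: sheet mass from sheet size
# and grammage, then mass per hour at the stated press speeds.
def sheet_mass_kg(width_mm, height_mm, gsm):
    return (width_mm / 1000) * (height_mm / 1000) * gsm / 1000

low = sheet_mass_kg(810, 650, 80) * 6000      # small sheet, light paper, slower press
high = sheet_mass_kg(720, 1020, 110) * 10000  # large sheet, heavy paper, faster press

print(f"{low:.0f} kg/h to {high:.0f} kg/h of paper fed into the press")
# This reproduces roughly the 250-810 kg per hour range quoted above.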
Powered lift trucks were used for the carriage of paper from store to press and from the
press to the temporary storage/drying area but hand pallet trucks were used for transport close
to the presses. The force required to start to move these hand pallet trucks was often found to
be in excess of 250N and between 100N and 200N was typically required to keep the load
moving. When in the press a similar force was applied to the pallet, with a foot, to hold it
“hard home” while the truck was removed.
During normal running the printers must also transport ink and varnishes. In the case of
the smaller print runs this could come in plastic kegs of 2.5kg, 5kg and 10kg nominal weight
that would require lifting onto/into the press. In one case heavier ink kegs (>25kg) were used, from which the ink could be pumped; while these did not require lifting, they often required dragging into position, and large forces were then applied to transfer the pump from the empty to the full keg. The force required for these operations could exceed 250N, the
capacity of the measuring equipment available. Another large, but less obvious,
musculoskeletal hazard associated with the handling of ink is the use of the spatula to transfer
ink from the tub to the rollers in the ink reservoir. The viscosity of the ink would vary with the
precise specification, colour (in one plant red was often found to be stiffer than blue) and,
obviously, the temperature. In order to minimise the viscosity the kegs were often kept in a
warm water bath or balanced on (hot) electric motors.
When running normally the noise of the presses was at, or above, "the second action level" (90dB(A)) (SI 1989:1790). Chemical hazards were also present from contact with the inks and varnishes and their fumes/vapours. In one plant ultraviolet (UV) cured varnish was used, requiring hazardous high-intensity UV light sources (PIAC 1993).
When long jobs are running on a press it will, typically, be stopped for routine cleaning and
maintenance during every shift, and also in response to changes in print quality. The printing plates, and possibly "blankets" and "wipers", will also need changing between jobs, together with a full clean of the ink reservoir and ducts. These are heavy manual operations, often involving strong solvents, and are generally undertaken in congested areas where the postures that might be adopted are constrained. A "short" run aluminium plate for a "litho" press might only weigh 0.63kg but one "grown" for a large intaglio press might be 5.65kg. The various
rollers and wipers might weigh between 10kg and 32kg while blanket cleaners might weigh
30kg and all will require routine exchange. In all cases the ability to adopt desirable postures
and to use mechanical devices to lift and manoeuvre these heavy loads is severely limited.
The press operators were an ageing, all male, close-knit group who, although confident in their work and their ability to produce the highest quality work that their machines were capable of, often wished to move onto fixed shift patterns. This was not generally an option, given that the capital investment involved in a large press is readily justified only by continuous operation. In one plant a group of five printers all reported suffering pain or discomfort as a result of the work. In one case their doctor had diagnosed "tennis elbow" and in another a non-specific wrist sprain. The printers also reported several traumatic injuries, often associated with slips, trips and falls, that had occurred to others. It would not be possible to greatly improve the fundamental ergonomics of the presses without a complete re-design, major expenditure and a long period of time.

Sheeting
All of the plants undertook some inspection before the sheets were cut. In the case of the highest
quality products every sheet was checked by an inspector on both sides and the edges and
corners were also “flicked”. This work was generally undertaken by female staff on sloping
benches either standing or sitting on high chairs. In the case of the large sheets (eg 810mm×650mm
or more) sitting was not an option for most as it made it impossible to view or handle the far
edge. The rate of working depended upon many factors including print quality, image complexity,
page size and the extent to which one page “stuck” to the next. A typical workload would
consist of 13–17 reams (500 sheet) daily implying a steady “pull-over” rate of over 2 sheets per
second. Each ream would weigh between 14kg and 23kg and might be difficult to “roll over” or
for the porters (male) to pick up and carry. In another plant, where sheet inspection was less commonly undertaken, the work was carried out standing and the sheets were transferred between two stacks kept at waist height by the use of pallet "scissor jacks".
In one case all 22 staff present were questioned. Ten reported some aches, discomfort or
pain in the lower back, neck and shoulder that they associated with work but none regularly
took medication to subdue or “cure” the pain. One person had sought medical treatment for
“wrist sprain” from the company nurse. In another plant 12 staff were questioned and 8
reported aches, discomfort or pain; one was taking "Over-The-Counter" analgesics and two,
with wrist pains, had sought treatment from either their own or the company doctor. Injuries
reported also included small cuts from the paper and dermatological reactions to the inks and
varnish. The latter could be a particular problem if it remained undried/uncured by this stage.

Knocking-up/jogging
At several stages in the printing process there can be a need to “break-up” the paper, aerate
the stack and thus ensure the precise register of each sheet. In the case of the fastest presses
this can be necessary before the paper goes in but in most cases the task only occurs between
the printing and finishing stages. The knocking-up can be entirely manual where a wad of
paper is “flicked”, “shimmied” and the sheets separated. Alternatively, the wad may be
“broken” and then loaded onto a “jogging table” that mechanically agitates the paper, against
an edge. In each case the wad of sheets handled at any one time will vary but 25mm thickness
and 10kg are typical. Greater loads may be undertaken by staff attempting to "work ahead"
or when the sheet size is small.

In most cases the load handled is of concern as not only is the paper difficult to keep
together but it must be “flicked” in mid air, typically about shoulder height, and transferred
from one location to another. The workplace can be set-up with the “jogging table” (c750mm
high) and a supply pallet 1000mm apart with the person standing between the two. Unless
some form of pallet jack was used then the paper transferred from the supply pallet would
vary in height, from above 1800mm for the first sheets to 150mm for the last.
This is heavy work that was always found to be undertaken by males. In one plant only
one person from the nine people undertaking the work reported musculoskeletal pain (lumbar
back) and one more refused to answer. There can also be dermatological risks with this work,
especially if any uncured or undried ink/varnish is present but, with one exception, it was not
thought possible to do the work wearing gloves or barrier creams.

Guillotine Operator
The guillotine operator will receive the paper "knocked-up" and in a standard quantity, usually 1000 sheets. In all cases the guillotines observed were pre-set to cut following the operator's orientation of the paper and command. In all cases the paper was pushed or pulled on low
friction surfaces and the final “cut sets” pushed aside. In this aspect of the work smaller products
not only required more cuts but also more manoeuvring and near maximal reaches into the
guillotine and were, thus, more likely to lead to musculoskeletal injury than larger sets.
The removal of the waste is also of concern: yields of 95%+ would be designed for, but these could, under some conditions, drop to 75%. Thus 1000 sheets could result in scrap weighing between 3kg and 10kg; this was often thrown, backhand, into high-sided skips. In another plant the scrap was loaded into plastic sacks which, when full, were carried to, and lifted into, a skip with sides 1650mm from the floor. In this plant the mean sack weight was found to be 12.3kg (10.5kg–14.6kg). The operators generally dealt with two sacks at a time, one in each hand.
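The scrap figures can be checked roughly from the sheet data given earlier; the sheet size and grammage used below are assumptions chosen for illustration.

# Rough check of scrap per 1000 cut sheets at the stated yields, assuming an
# 810 mm x 650 mm sheet at 80 gsm (heavier or larger sheets give more scrap).
sheet_kg = (0.81 * 0.65) * 0.080
for yield_fraction in (0.95, 0.75):
    scrap_kg = 1000 * sheet_kg * (1 - yield_fraction)
    print(f"yield {yield_fraction:.0%}: about {scrap_kg:.0f} kg of scrap per 1000 sheets")
# This spans roughly the 3-10 kg range reported above.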
The guillotine operators observed were all male and had formed strong “buddy” groups.
In one plant eight operators were questioned and two reported pain/discomfort in their back, neck and shoulder that they felt was "work related"; neither had sought medical attention about this matter. The work is, largely, machine-paced.

Set Inspection
The inspection of the cut sets can vary greatly depending upon the range of product produced.
In one plant, for example, the sets are inspected by "flicking"/"fanning" to see if the image moves, alters in colour, etc., an operation that is repeated for each end/corner and, in some cases, for both sides. Smaller (eg 135mm×95mm) or near-square set sizes are, effectively, stiffer and the task more hazardous, especially if it is undertaken in free air rather than with the set supported by the worksurface.
sample sheets from which the entire set was accepted or rejected. In either case this was a
highly subjective inspection with undefined decisions made as to what the customer would
accept and what they would not.
The distribution of products could vary greatly (Table 1) and many of the most common products were found to be ergonomically undesirable to work with. For example, sets of 320–370mm×140–195mm×70–80mm with weights between 2.97kg and 3.97kg were, musculoskeletally, difficult to work with but often found. The use of rubber bands to hold the sets together also created musculoskeletal hazards when they were either put on or removed; however, they were only used if the sets were handled away from "float" tables.

Table 1. Summary of cut set sizes (two ream) found in one plant

The set inspection work was undertaken by females and, generally, some degree of job
rotation occurred. In some cases work was undertaken while sitting, with the product, in tote boxes, delivered and removed by conveyor belt. In other plants standing staff collected and despatched work, in sets, on "float" tables. In one plant, where this task was undertaken by standing staff, it was also rotated (within the shift) with the loading of pallets for despatch and with the shrink wrapping of the bundled product. In this plant twenty members of staff were questioned. Seven reported musculoskeletal pain or injury that they associated with the work and three had received medical treatment: carpal tunnel syndrome (2) and lateral epicondylitis with weakness of grip (1).
started on this work less than three months ago.

Wrap and Despatch


This task again required the handling of sets, possibly the frequent removal of rubber bands
and the presentation to banding or shrink wrapping machines in the required bundles. Again
large span grips could be required to hold the product during presentation and, possibly,
during the loading of pallets or the transfer of bundles to storage. The loading of pallets,
especially if they were not adjusted in height, presented musculoskeletal hazards, and the
danger of tripping when walking among them. This work was undertaken in all plant by
females; except for the reloading of the banding or shrink wrapping machines. Although
infrequent this could be a hazardous task. Loads of 23.5kg could require handling, with
constrained postures, into the machines. However, none of those undertaking the maintenance
work reported any injury, nor could any records be found.

Conclusion
Six of the jobs to be found in high quality sheet printing have been described and found to
raise many areas that would benefit from the attention of an ergonomist. In those plants where this work has been undertaken, some actions have already been taken to improve the ergonomics of the tasks, workplaces and conditions, especially in the finishing stages. Further changes and better risk monitoring are planned for the future. However, the presses themselves remain a major concern that cannot be tackled to any great extent by the end user. It is to be hoped that press manufacturers are, at present, addressing such issues, especially those associated with cleaning and "set-up", and that the next generation will be much more ergonomically acceptable than those currently in use.

References
Printing Industry Advisory Committee (1993), Safety in the use of inks, varnishes and
lacquers cured by ultraviolet light or electron beam techniques, (HMSO, London)
Statutory Instrument—Noise at Work Regulations 1989 (SI 1989:1790)
THE APPLICATION OF HUMAN FACTORS TOOLS AND
TECHNIQUES TO THE SPECIFICATION OF AN OIL
REFINERY PROCESS CONTROLLER ROLE

Janette Edmonds1 and Chris Duggan2

1 Human Engineering Limited, Shore House, 68 Westbury Hill, Westbury-on-Trym, Bristol, BS9 3AA

2 BP Oil UK Limited, Coryton Refinery, The Manorway, Stanford-Le-Hope, Essex SS17 9LL

This paper discusses the use of human factors tools and techniques to solve
real human issues in industry. The paper uses the example of a project
involving the specification of an oil refinery process controller role. The
investigation is used to illustrate the importance of effective and efficient use
of human factors methods, to meet the usual industrial constraints of short
timescales and limited budgets.

Introduction
The aim of this paper is to discuss the application of human factors tools and techniques to
support decisions concerning human related issues.
To support the discussion, reference is made to an investigation which was undertaken on
a BP oil refinery in Essex. The investigation focused on a proposal to introduce a new process
controller role, and, in particular, the implications this would have on the workload for that
role, the effect on team structure and function, and subsequent training requirements.

Background To The Study


The investigation was part of a process of extensive technological and organisational change
taking place on the oil refinery. The plant is divided into five areas: the Cracking Complex,
Fuels, Lubes, Utilities and Product Movements. At the time of the investigation, these areas
were controlled locally in five separate control rooms. The main change for the refinery was
that these five control rooms were to be amalgamated into one centralised control building
(CCB), remote from the plant. In addition, new digital control systems were being introduced to
some areas which relied on manual systems, or a combination of manual and automatic systems.
As a consequence of the move to the CCB, and the subsequent requirement for remote
control, a new process controller (PC) role was proposed for one of the plant areas: product
movements.
Some concerns were expressed regarding whether the new PC role presented unacceptable
levels of workload, and whether the remoteness of the PC from the rest of the product
movements team had implications for the ability of the team to cope with new methods of
working.
Due to these concerns, Human Engineering undertook an investigation to:

• Analyse the workload for the proposed role


• Identify potential impacts on team structure and function
• Identify the training requirements for the team to support the changes

In addition to supporting specific decisions that would be made by the area management, the
aim of this investigation was to support a consultation process with the work force. This was
to ensure that the work force were involved in the development of their own job roles and to
explore the issues that they would have extensive knowledge of as subject matter experts.

Human Factors Intervention

Development Of A Job Specification


The investigation began with an initial phase of consultation to develop a clear outline of the
future job role for the PC working from the CCB. Data were collected separately from the area
management and the nominated persons who would take responsibility for the new PC duties.
The area management were asked to describe the duties that the PC would undertake,
including proposed new tasks and any additional equipment that would be introduced. They
were also asked to describe potential high workload scenarios.
The nominated PCs were asked to describe the tasks currently undertaken during routine
day and night shifts, potential ‘worst case scenarios’, where it was envisaged that the PC
would be under high workload, and how they handle such situations. The plans for new
equipment and new job tasks, which they were aware of, were also discussed. The PCs were
asked to rate the workload for all future tasks, using a five point rating scale. This was used to
anchor ratings between respondents.
From the information gathered, a job specification was prepared, detailing the tasks that
the PCs would undertake from the CCB.

Workload Analysis
The data from the initial consultation were used to develop task models of the PC role, using
a software task analysis tool (Human Engineering Limited, 1996). A series of baseline task
models were developed to reflect busy routine day and night shifts, but without any specific
incidents which would increase the workload significantly, referred to as ‘routine’ scenarios.
The baseline task models were then developed further to reflect situations where
particular types of upsets occurred at different times during the day and night shifts, referred
to as ‘high workload’ scenarios. These included: emergencies on the product movements
area, emergencies on the jetties, and emergencies occurring elsewhere on the refinery. The
task models focused only on the role of the PC.
The task models were verified by the area management and by the nominated PCs as
being representative.
Workload profiles were then calculated for the routine and high workload scenarios. The
workload profiles were ‘demand’ based calculations which calculated the sum of the
workload demand ratings for each task in a given time frame. This was based on the D model
(Aldrich et al, 1988), as follows:

D_t = Σ_a (A_at × d_at)

where A_at = 1 if action a occurs at event sequence number t, and A_at = 0 otherwise;
a = action
t = event sequence number
d_at = the demand rating for action a at event sequence number t

The workload demand calculation samples the tasks within specified time intervals over
the duration of the 24 hour shift period. For each time interval, the ‘demand ratings’ are
summated. The workload profile then illustrates the summed demand ratings over a specified
period, i.e. 24 hours.
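
To make the calculation concrete, the sketch below sums demand ratings over fixed time intervals in the way described above. It is a minimal illustration only: the interval length, data format and function names are assumptions, and it is not the implementation used in the ATLAS tool.

```python
from collections import defaultdict

def workload_profile(tasks, interval_minutes=15, shift_minutes=24 * 60):
    """Sum the demand ratings of all tasks active in each time interval.

    `tasks` is a list of (start_minute, end_minute, demand_rating) tuples;
    the data format and 15-minute interval are illustrative assumptions.
    """
    sums = defaultdict(float)
    for start, end, demand in tasks:
        # A task contributes its demand rating to every interval it overlaps.
        first = int(start // interval_minutes)
        last = int((end - 1) // interval_minutes)
        for i in range(first, last + 1):
            sums[i] += demand
    n_intervals = shift_minutes // interval_minutes
    return [sums.get(i, 0.0) for i in range(n_intervals)]

# Example: three overlapping tasks early in a shift (minutes from shift start).
example_tasks = [(0, 30, 3.0), (20, 60, 4.0), (45, 90, 2.0)]
print(workload_profile(example_tasks)[:6])  # summed demand per 15-minute interval
```

Peaks in such a profile that exceed an assumed one-person capacity correspond to the workload peaks discussed in the following paragraphs.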
It was found that, during a routine ‘busy’ day or night shift, the PC would be working at
maximum capacity for less than 50% of the shift, and that there would be workload peaks
above one person’s capacity for less than 10% of the shift duration. By closer investigation of
the tasks causing the workload peaks, it was concluded that the workload problem could be
eliminated by effective task rescheduling. In addition, the workload could be reduced further
by acceptable temporary suspension of certain tasks, and through the area teams being aware
of the demands being made on the PC at any given time.
During ‘high’ workload scenarios, however, it was found that the PC would be working at,
or beyond, one person’s capacity for between 29% and 55% of the duration of the upset, dependent
on the type of upset. It was therefore concluded that two people must be available to work the
panel during upsets.
However, during the initial stages of the upset, it was assumed that support (from the
Process Technician (PT)) would be unavailable, as the PT job was primarily designed to be
outside on the plant.
Despite the equipment being designed to be ‘fail safe’, it was also recommended that the
following provisions be made:

• To increase the availability of support, for example by having more than one team member
and/or other CCB-located people trained to support the PC whilst the PT is unavailable
• To provide additional training to the PC and PT to improve their skills at diagnosis of
upsets
• To provide additional training to the PC and PT to act quickly and efficiently during the
initial few minutes of an upset

Consultation With The Area Teams


The PC job specification and the results of the workload analysis were put forward for
discussion with the area teams. Shift team members were asked to comment on the proposed
organisation of the new team structure and the impact of the PC role on the team function.
The proposed re-organisation of the team activities was based on the following:

• New plant area divisions for product movements (previously decided by shift teams)
• The reduced requirement for continuous supervision of ships on the jetties (due to the
installation of surveillance equipment, electrically operated valves and ship position
alarms)

Outside team members would retain ‘control’ of their areas, but the PC would operate some
of the automatically controlled valves.
The shift team members felt that they could support the PC by:

• Duplicating the surveillance control equipment for the jetties, so that the jetty technician
could take responsibility away from the PC during peak workload situations
• Developing a working practice for ensuring that all team members regularly input relevant
data to the computer system which records the latest tank levels and contents (as the PC
would have greater reliance on this information)
• Maintaining good team communication via the radio network

It was generally felt that the workload of the outside team would increase, especially during
upsets, as the PT would be required to support the PC in the CCB. The following
recommendations were made to reduce the workload of the outside shift teams to enable them
to cope with the changes:

• To improve co-ordination and mutual understanding with other departments within the
refinery for more efficient problem solving
• To reduce unnecessary enquiries by providing an on-line information system which could
be accessed by other refinery departments requiring information about the product
movements plant activities
• To increase the reliability of specific plant equipment
• To develop working practices for product movements and clear roles and responsibilities
for team members
• To develop protocols for emergency situations

Top level task analyses were undertaken for each of the new team roles. Using the task
analyses, a comparison was made between old and new task knowledge/skill demands. A
training requirements matrix was then developed, with support from the shift training
technicians. The key training requirements were as follows:

• Diagnosis of upsets and emergency responses


• Cross area training within the product movements team
• Familiarisation with other refinery departments
• Additional safety and fire-fighting training
• Refresher training for outside team members to support their continued knowledge and
understanding of the PC tasks
• Refresher training for PCs to support their continued knowledge and understanding of the
plant tasks

It was recommended that the training be delivered prior to the move to the CCB, and that it
be a mixture of formal, on-the-job, and simulation training.
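
As an illustration of the kind of comparison behind a training requirements matrix, the sketch below flags a training need wherever a knowledge/skill area is demanded by a new role but not by the old one. The role and skill names are hypothetical examples, not those used in the study.

```python
ROLES = ["process controller", "process technician"]              # hypothetical
SKILLS = ["upset diagnosis", "cross-area tasks", "fire-fighting"]  # hypothetical

def training_matrix(old_demands, new_demands):
    """Flag a training need where a skill is required by the new role
    but was not required by the old one."""
    return {
        role: {
            skill: (skill in new_demands.get(role, set())
                    and skill not in old_demands.get(role, set()))
            for skill in SKILLS
        }
        for role in ROLES
    }

old = {"process controller": {"fire-fighting"}}
new = {"process controller": {"upset diagnosis", "cross-area tasks", "fire-fighting"}}
print(training_matrix(old, new)["process controller"])
```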

Recommendation
It was recommended that the proposed PC role was viable, given the requirements for
additional support discussed in the previous sections.

Discussion
The aim of this paper is to illustrate the advantages of human factors being applied in a
timely, efficient and highly structured manner. An investigation to specify an oil refinery
process controller role was described to support this aim.

The Case Study


The aim of the investigation was to assess the viability of the proposed new role of the PC, in
terms of the PC workload, the potential impact on the team structure, and any subsequent
requirements for training.
Indeed, the customer required independent advice about the viability of the role, and
wanted a relatively quick assessment without it being prohibitively expensive (in terms of the
assessment itself or any subsequent solutions). In addition, the customer required a neutral
‘body’ to be involved in the consultation process to ease the transition to the CCB.
The customer requirements, therefore, dictated how the investigation needed to be
undertaken. It was essential that:

• The focus of the investigation was on the PC role


• The analyses were conducted in sufficient detail to answer the question, but within budget
and timescale constraints
• The investigation supported the consultation process, as well as answering the question

The ingredients for meeting these requirements relied on three key factors:

• A comprehensive set of task analyses


• A high level of contact with team members and area management, and immediate and
effective feedback between them
• The use of highly structured tools and techniques for conducting the analyses

The task analyses were extremely important for shaping the whole of the investigation.
Through gathering sufficient detail of the PC tasks, and the tasks of each product movements
team member, a great deal of clarity was gained very quickly.
In addition, all areas of uncertainty or concern were recorded and fed back between the
refinery workers and the area management in a timely and structured manner. This helped to
gain clarity at the early stages of the investigation, and indicated to the workers that their
opinions were being heard and given a high level of attention.
The investigation involved the application of a number of human factors techniques.
These included task analysis, workload analysis, training needs analysis, and data collection
through interview. To support the process, a software task analysis system was used. This
helped to streamline the process of undertaking the analyses and supported the re-use of task
analysis data for the workload analysis and training needs analysis.

The Case Study With Respect To The Application Of Human Factors


The investigation took two months to complete. The solutions required some minimal
additional costs for training, engineering maintenance time, and time to develop the operating
and emergency procedures.
It was considered that the investigation had answered the questions it intended to answer,
and that neither the human factors intervention nor the resulting recommendations were
prohibitively expensive. It was subsequently recognised by the area management that the
intervention had satisfactorily met their requirements. Indeed, through further discussions
and circulation of the report to other area managers, it became evident to them that human
factors interventions could support them in a variety of similar ways.
The paper demonstrates how the structured use of tools and techniques can support
projects in many ways. In particular, it shows how objective data can be used to answer
specific questions and to support a consultation process. It also shows that the methods can be
applied efficiently and effectively, even when timescales are short, without becoming
prohibitively expensive.
It also became evident during the work programme that it was essential for the human
factors intervention to be visible and not seem unnecessarily complex to those involved,
especially when they do not have a detailed knowledge of human factors.
Finally, whilst undertaking this and other work programmes, it has become clear that the
role of the human factors practitioner is at least twofold. Firstly, it is essential to ensure that
work is undertaken professionally, efficiently and cost effectively to answer the project
questions. It is also important for human factors practitioners to market themselves
effectively to increase the awareness of the importance and the range of applications of
human factors.

References
Aldrich, T., Szabo, S., & Bierbaum, G.R., 1988, The Development and Application of
Models to Predict Operator Workload During System Design. In G.McMillan (ed.)
Human Performance Models, (NATO AGARD)
Human Engineering Limited, 1996, ATLAS—A Practical Task Analysis System. (Software
Created For The Apple Macintosh. Human Engineering Limited. Version 1.1K)
FEASIBILITY STUDY OF CONTAINERISATION FOR
TRAVELLING POST OFFICE OPERATIONS

Graeme Rainbird and Joe Langford

RM Consulting,
Royal Mail Technology Centre,
Wheatstone Road, Dorcan,
Swindon, SN3 4RD

This paper describes the feasibility review of a major operational change to


Royal Mail’s Travelling Post Offices, in particular, the introduction of trays
and roll cages for handling mail. Task analysis of the operation and workshops
with TPO staff identified a range of relevant issues. Initial concepts were
developed and evaluated. Technical problems, such as methods for securing
roll cages and trays were solved, at a high level. A risk assessment study was
employed to evaluate the acceptability of the concepts with key stake-holders.
It was concluded that containerisation of TPOs is viable, and that the approach
was a good illustration of the value of employing ergonomics methodology as
a fundamental part of the design and development process.

Introduction
Royal Mail strives to achieve next day delivery for a very high proportion of first class mail.
Letters and packets which must travel long distances across the country can present a
problem. Travelling Post Offices (TPOs) are trains on which mail is sorted in transit to help
meet delivery schedules.
The rail network has been used to transport and store mail for over a hundred and fifty
years. TPO design has changed little over the last 60 years and the newest rolling stock dates
back to 1977. Mail is taken into carriages in mail bags, the mail is tipped and sorted, re-
bagged, and despatched at stations along the route. Simple wooden sorting frames, consisting
of a number of boxes, are used to segregate the letter mail. All mail movement and sorting
tasks are done by hand.
Mail profiles (the type and proportions of letters and packets) have changed over time. In
addition, new mail streams and services have been introduced, including ‘Priority’ mail—
high value items which require increased levels of security. TPOs have been adapted over
time in a ‘piecemeal’ fashion to accommodate these requirements.
A more fundamental change to the whole mail distribution network has also taken place
over the last five years. Traditionally all mail was transported in bags. Now containerisation
has been introduced, so that most mail is handled in trays, which are transported in roll cages.
This improves the efficiency of mail handling, but has had a profound effect on many
processes and equipment across Royal Mail. TPOs have not yet been containerised. Mail is
still shipped in bags, and as such the TPOs are incompatible with the overall network.
The viability of containerisation of TPOs is currently under review. Ergonomists in RM
Consulting were requested to determine the feasibility of containerising the TPO operation,
and to identify options for subsequent design and development.

Review of the Existing TPO Operation


The first step was to develop an in-depth understanding of the current TPO operation. Having
ridden the TPO and interviewed staff, a task analysis was completed. At a high level, the task
includes: preparing the carriages for departure; loading the mail to the appropriate carriage
for sortation (mail is pre-sorted by ‘divisions’ which relate to the station at which it is
despatched); sortation of mail at sorting frames following departure; clear down bundles of
mail to bags; and despatch of bags at stations along the line.

Figure 1. An example TPO work plan

Outputs from Workshops with TPO Staff


An early activity involved TPO postal staff and supervisors, and managers from headquarters,
taking part in a workshop to review the current operation and consider the issues for a
containerised system. Participants reported their likes and dislikes and identified areas for
improvement with the current TPOs. Areas for improvement included:

Equipment
Some features of the current sorting fittings are liked; for instance, the glass-bottomed pigeonholes
make checking easier at clear-down. Staff recognised, however, that in many respects the fittings
are out of date. For example, there are no frames specifically for large envelopes; the seating is
inadequate and uncomfortable; bag labels are awkward to change and difficult to read.

Environment
Many aspects of the TPO environment are unsatisfactory. It is too hot in the
summer and too draughty in winter. Lighting is poor and there is much dust from mail bags.
Floors are slippery when wet. Train rocking at high speed causes discomfort and occasional
minor collisions between people. TPOs are also cramped, which is a problem during peak
work loads.

Task
Time is the key factor for all tasks, including sorting, clearing down, tying bags and loading/
unloading at stations. The job is generally well liked and there is a good collaborative
atmosphere. The sorting staff like the challenge of meeting targets under demanding
conditions. The most unpopular task is moving mail bags between carriages.

Initial Concepts for Containerisation


Participants at the workshop were also asked to develop their own ideas for the design of a
containerised TPO. Their solutions were drawn up on flip charts and provided a very rich
source of information identifying critical issues. The example shown in Figure 2 is intended
to illustrate the nature of the responses, rather than any specific detail.

Figure 2. An example design for a containerised TPO from the user workshop

The designs from the workshop could all be categorised as one of two
fundamentally different types: either store and sort mail in separate carriages, or store and
sort mail in the same carriage. This factor will influence all the other design issues such as
loading methods, storage, floor plans, allocation of tasks etc.
Each type of design has advantages and disadvantages. If the mail is stored and sorted in
separate carriages the staff have minimal contact with the roll cages which minimises hazards
associated with containers on a moving train, and may reduce refurbishment costs. The major
disadvantage is that the mail must be transported a greater distance to the work area before
and after sortation. It was generally accepted by the workshop participants that if the mail can
be stored and sorted safely in the same carriage there will be significant operational benefits.
However, regardless of the option selected, TPO staff must be in contact with roll cages and
trays when the train is moving to be able to retrieve and replace mail.

Figure 3. One of the design concepts for securing roll cages and mail trays.

For the containerisation of TPOs to be viable, therefore, it must be possible to restrain the
roll cages and trays in the event of a disaster such as a train crash. To demonstrate that this was
feasible, a range of restraint designs were developed, with input from industrial designers and
mechanical engineers. This initial process was not intended to identify the final solution, but to
demonstrate that a solution was achievable. Figure 3 shows one potential securing method. In
this example, the straps in front of the roll cage are elasticated to allow access to the trays.

Evaluation and Risk Assessment


The design study showed that in principle it is possible to adequately secure roll cages and
trays. It was clear, however, that there would be different hazards in the operational environment
and significant modifications to the ways of working. It was felt that at this stage it was important
to carry out a more formal assessment to test the acceptability of the concepts to the users and
other key stake-holders. To this end an initial Hazop study was carried out with TPO staff,
managers, safety consultants and representatives of the train operating company. The study
concluded that, in general, the risks would not be any greater than for the current TPO system.
Additional design features were also identified through this process. For example: the roll cage
restraint system should have an interlock system to ensure all containers are secured before
departure; all securing latches etc. should be designed to accommodate staff wearing gloves;
and the doors between carriages must be negotiable by staff using two hands to carry trays.
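
As a simple illustration of the interlock requirement identified above, the sketch below permits departure only when every restraint reports secured; the boolean sensor representation is an assumption for illustration, not a proposed design.

```python
def departure_permitted(cage_latches, tray_restraints):
    """Interlock check: departure is permitted only when every roll cage latch
    and every tray restraint reports 'secured' (True). The boolean sensor
    representation here is assumed purely for illustration."""
    return all(cage_latches) and all(tray_restraints)

print(departure_permitted([True, True, True], [True, True]))   # True: all secured
print(departure_permitted([True, True, True], [True, False]))  # False: one tray loose
```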

Conclusions
• Containerisation of TPOs is a viable operational concept which would be acceptable to
TPO staff
• Storing and sorting mail in the same carriage would present an operational advantage, as
time taken retrieving and replacing mail to and from storage would be minimised
• There are many opportunities for improving the TPO equipment, environment and tasks in
an upgraded system

A full-scale model of half a train carriage has been obtained, and a mock rail platform has
been built to allow the loading of roll cages and mail trays. The next stage of the project is to
determine the optimum carriage layouts and to build prototypes. These will be tested with
TPO staff and detailed layout and equipment design issues will be explored. If successful, the
testing and development programme will be continued on real TPOs.
The approach taken by the project team is considered to have been very successful so far
in that:

• User opinions and knowledge, gained at a very early stage, were key to forming the initial
design options
• The logical steps to the process meant that any potential project stoppers would be
identified at a very early stage, thereby preventing unnecessary spend
• User buy-in and communication from the outset are likely to be important if and when the
new system is introduced.
MILITARY
APPLICATIONS
THE COMPLEXITIES OF STRESS IN THE OPERATIONAL
MILITARY ENVIRONMENT

Matthew I.Finch & Alex W.Stedmon1

Centre for Human Sciences


1

DERA Portsdown West


Fareham, Hants. PO17 6AD. UK.
Tel: +44 (0)1705 336424
Email: astedmon@dera.gov.uk

Stress is a highly problematic concept to define, and may be considered as a


generic term for responses to four main categories of stressors. Stressors
vary in their manifestations and effects, depending on the individual
concerned, the situation encountered and the combination of stressors therein.
Consideration is given to stressors typically experienced in operational
military environments, ranging from the pilot in a fast-jet cockpit or naval
officer in an operations room, to the soldier on the battlefield. By the very
nature of this environment, conventional methods of stress measurement are
impractical. Attention is focused on literature that details non-invasive
measures of physiological correlates of stress. It is argued that this method is
suited to the complex and dynamic nature of the working environment and
may be exploited to improve selection, training and combat systems.

The Complexity of Stress


Stress can be very loosely defined as “the process of adjusting to circumstances that disrupt,
or threaten to disrupt, a person’s equilibrium” (Bernstein et al, 1988). Whilst this definition is
somewhat vague, it does illustrate the difficulty of trying to formulate a universal definition.
That a recent ESCA-NATO Workshop failed to provide a single standard definition, and that
it arrived at six definitions, illustrates, as Cox states, how “elusive…[and]…poorly defined”
stress can be (in: Murray et al, 1996).
Murray et al (1996) propose that stressors are the factors which induce a state of stress in
an individual or situation. Whilst this may not be very helpful in itself, it allows a further
definition of the stress/stressor relationship, which Murray et al place into four distinct
orders. Zero order effects manifest themselves in physical changes brought about by the
stressor; first order effects relate to physiological changes brought about by the stressor;
second order effects are the psychological changes brought about by the cognitive
interpretation of a stressor; and third order effects are the re-interpretation of second order
effects so that stress effects are compounded.

One of the reasons for this is that the interpretation of stress is highly subjective: it will
differ between individuals and situations and may even differ for a particular individual at
different times. This subjectivity of stress is supported by Albery (1989), whose studies have
shown that, although biodynamic stresses can affect subjective measures of workload, this
effect is not necessarily reflected by objective task performance.
Formalising the stress/stressor relationship in this way forms a common basis for
interpreting various stress effects so that, for example “researchers into speech produced by
pilots of high performance aircraft are most concerned with zero-order effects [vibration and
G-force], whereas researchers looking at workload would be more interested in second-order
effects [which are more prone to individual changes]” (Murray et al, 1996). What is still
unclear about the concept of stress is that whilst it is possible to identify specific stressors
that cause individuals to become stressed, the cause and effect relationship is still not fully
understood.

A Taxonomy of Stressors
An inherent problem in defining stressful events is that “it is often difficult to define the
characteristics of a specific situation which are stressful” (Baber et al, 1996). In an attempt to
deal with this problem, members of the ESCA-NATO Workshop drew up a taxonomy of
stressors, based on Hayre’s (in: Trancoso and Moore, 1995) four categories: Physical,
Chemical, Physiological and Psychological. This is detailed in Table 1 and elaborated in
relation to combat environments.

Table 1. A Taxonomy of Stressors

One thing that is apparent from the list is that stressors are not exclusive to one particular
category. For example, sleep deprivation may well manifest itself in a number of ways that
serve to define it under three of the four categories. Furthermore not all the effects of a
stressor may arise in any given scenario. The stress effects on an individual may arise from
independent stressors, combinations of stressors and even the subjective interpretation of
stress effects which may serve to compound the initial stress episode. The effects of these
stressors are, therefore, highly specific, not only depending on the task and operational
environment, but also on particular individual traits.
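
One way of recording the multi-category nature of stressors described above is to tag each stressor with the taxonomy categories it may fall under, as in the minimal sketch below. The example assignments are illustrative assumptions and do not reproduce Table 1.

```python
CATEGORIES = ("physical", "chemical", "physiological", "psychological")

# Illustrative assignments only: sleep deprivation is assumed here to span
# three of the four categories, as suggested in the text above.
stressors = {
    "sleep deprivation": {"physical", "physiological", "psychological"},
    "vibration": {"physical"},
}

def categories_of(stressor):
    """Return the taxonomy categories a stressor has been tagged with."""
    return stressors.get(stressor, set())

print(sorted(categories_of("sleep deprivation")))
```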

Measuring Stress
Various strategies have been developed to examine the complex interrelationships of
stressors. Whilst traditional methods for measuring psychological stress have relied upon the
use of questionnaires, another approach is to assess the physiological parameters associated
with stress. However, as Cox (1985) argues, there are no direct physiological measures of
stress, only physiological correlates.
These correlates include such variables as the level of adrenaline in urine samples (Kagan
& Levi, 1975), certain metabolites in blood samples (De Leeuwe, et al, 1992); and REG/EEG
patterns (Montgomery & Gleason, 1992). Endresen, et al (1987) suggest that immunological
parameters may be used as a psychological stress indicator. Psychological stress produces
immunological changes in animals and increasing evidence suggests that this may also be the
case for humans. That said, the relationship between stress and the immune system is
complex, and best understood in conjunction with individual coping and defence
mechanisms. Physiological correlates of stress pertaining to arousal are particularly relevant
to time pressure situations. In their study, Cail & Floru (1987) found that, for participants
in a time-pressure condition, performance, error-rate, EEG Beta Index and heart rate were
significantly higher than in a self-paced condition. Jorna (1993) states that the “monitoring of
heart rate in aviation research provides a global index of pilot workload.” Although
undoubtedly a useful technique, and indeed Jorna’s results indicate that cardiovascular
changes may be a good physiological correlate for stress, he also argues that it is, “more
complex to assess and therefore less often used.” This point is particularly pertinent when
considering the dynamics of the operational military environment which may serve to
confound the measure.

The Operational Military Environment


As Driskell & Salas (1991) state, there are few settings that come to mind when one envisions
extreme stress environments: airline emergencies are one setting, natural disasters are
another, but certainly the military combat environment is one of the most hostile situations in
which humans must operate. In many ways this working environment is unique, with
servicemen operating at the limits of their cognitive and physical ability, endurance and
stamina; in hostile environments, and with acute life/death consequences. In addition the
nature of military life promotes strong social bonds which whilst possibly offering a means of
social support and indirect stress management also imposes a strong sense of peer pressure.
This, in itself, may act as a stressor when individuals are faced with a situation they cannot
cope with but either perceive it as something they should be able to cope with or that others
are coping with.

Whilst laboratory studies, and studies conducted in benign environments, attempt to


control all but a few stressful variables, the military environment cannot be controlled in such
a manner. This has two immediate effects: one cannot carry out experimental procedures in
the field, and any measures of stress that are taken must be neither invasive nor
disruptive. It is clearly just as impractical to ask a pilot to complete a
mood questionnaire during a 9-G turn as it is to take a blood sample from a soldier during a
fire fight. Indeed, these impracticalities mean that psychological or direct physiological
testing of active combatants is impossible. The reactions of an individual to these extreme
stressors may not be known until they are placed in the environment and, when they are, they
must perform their set tasks efficiently. Indeed, as Driskell & Salas (1991) state, “it is
precisely in [the combat environment] that effective task and mission performance is most
critical.”
It is possible, however, to employ specific physiological techniques in the field, but one
must be aware of certain practical limitations when these are used. For example, using
electrical skin conductance to assess emotional stress would be affected by sweating,
possibly due to hard physical exertion, which in itself may not be a stressor in this particular
instance. Similarly, heart rate monitors would require advanced algorithms to differentiate
between changes due to stress and those due to running while carrying a wounded colleague.
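
As a deliberately simplified sketch of the kind of correction such an algorithm might apply, the example below predicts heart rate from physical activity alone and treats the remainder as a crude non-exertion component. The linear model and all parameter values are assumptions for illustration only.

```python
def non_exertion_hr(observed_hr, activity_level, resting_hr=65.0,
                    hr_per_activity_unit=40.0):
    """Residual heart rate after removing the component predicted from
    physical activity alone (a linear model assumed purely for illustration).
    `activity_level` is a 0-1 index, e.g. derived from an accelerometer."""
    predicted_hr = resting_hr + hr_per_activity_unit * activity_level
    return max(0.0, observed_hr - predicted_hr)

# A soldier running hard (activity ~0.9) with an observed heart rate of 155 bpm:
print(non_exertion_hr(155, 0.9))  # residual not explained by exertion
```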

A Case for Non-Invasive Measures


Measurement techniques that are non-disruptive must be, by definition, non-invasive. From a
practical perspective, any disruption to the combat scenario carries risks to the success of the
mission and well being of the personnel. Whilst there has been little research into non-
invasive measures of stress in the field some measurement techniques can be readily
incorporated into systems and kit that servicemen already use. For example, helmet-mounted
EEGs, helmet-mounted blink monitors, and chest-mounted heart rate monitors.
Different working environments will require different characteristics of the equipment
used to measure and record physiological correlates. For soldiers on the battlefield, weight
and size of equipment are of paramount importance. Indeed, with the manufacture of modern
materials, the onus is on the equipment suppliers to produce lighter and smaller end products.
For this reason, any equipment used for measurement must adhere to these principles. It may
be, however, that the pilot in a fast-jet cockpit can be directly attached to more equipment,
which is bulkier and heavier, due to the fact that he/she does not have to physically carry the
equipment or move about extensively within their immediate environment. Weight and size
considerations may be of less importance still for a naval officer in an operations room.
However, within this working environment, consideration must be given to the potential
crowding effects and the need to move around unhindered when the room is at its busiest.
From the literature, significant effects can be found for non-invasive measures in relation
to particular stressors. Monitoring of heart rate, total eye blinks, blink duration and
electromyographs (EMGs) supports the notion that physiological correlates of stress can
be measured without unduly affecting operator performance (Albery, 1989). Morris (1985)
also supports the use of eye blink measurements as predictors of performance decrements due
to stress.

Concluding Remarks
In order to address the complexities of stress in the operational military environment some
formal framework is required such as the taxonomy detailed above. From this theoretical
basis the delicate interplay of combinations of stressors can be rationalised and accounted for
in the selection and training of personnel, and systems design.

References
Albery, W.B., 1989, The effect of sustained acceleration and noise on workload in human
operators, Aviation, Space, and Environmental Medicine, 60(10 part 1), 943–948
Baber, C., Mellor, B., Graham, R., Noyes, J., Tunley, C., 1996, Workload and the use of
ASR: the effects of time and resource demands, Speech Communication, 20, 37–53
Bernstein, D.A., Roy, E.J, Srull, T.K., Wickens, C.D., 1988, Psychology, (Houghton Mifflin
Company, Boston)
Cail, F. and Floru, R., 1987, Eye Movements and Task Demands on VDU. In J.K. O’Regan
and A.Levy-Schoen (eds.) Eye Movements: From Physiology to Cognition, (North
Holland, Amsterdam), 603–610
Cox, T., 1985, The nature and measurement of stress, Ergonomics, 28, 1155–1163.
De Leeuwe, J., Hentschel, U., Tavanier, R., Edelbroek, P., 1992, Prediction of endocrine
stress reactions by means of personality variables, Psychological Reports, 79(3,1)
791–802
Driskell, J.E., Salas, E., 1991, Overcoming the effects of stress on military performance:
human factors, training, and selection strategies. In R.Gal and A.D.Mangelsdorff
(eds.) Handbook of Military Psychology, (John Wiley, Chichester), 183–193
Endresen, I.M., Vaernes, R., Ursin, H., Tonder, O., 1987, Psychological stress-factors and
concentration of immunoglobulins and complement components in Norwegian
nurses, Work and Stress, 1(4), 365–375
Jorna, P.G.A.M., 1993, Heart rate and workload variations in actual and simulated flight,
Ergonomics, 36(9), 1043–1054
Kagan, A., and Levi, L., 1975, Health and environment—psychosocial stimuli: a review. In
L.Levi (ed.), Society, Stress and Disease, Vol. 2, (Oxford University Press, New York)
Montgomery, L.D., Gleason, C.R., 1992, Simultaneous use of rheoencephalography and
electroencephalography for the monitoring of cerebral function, Aviation, Space, and
Environmental Medicine, 63(4), 314–321
Morris, T.L., 1985, Electrooculographic indices of changes in simulated flying performance,
Behaviour Research Methods, Instruments, and Computers, 17(2), 176–182
Murray, I.R., Baber, C., South, A., 1996, Towards a definition and working model of stress
and its effects on speech, Speech Communication, 20, 3–12
Trancoso, I., and Moore, R., (eds.), 1995, Proceedings of ESCA-NATO Workshop on Speech
Under Stress, Portugal.
THE DEVELOPMENT OF PHYSICAL SELECTION
PROCEDURES. PHASE 1: JOB ANALYSIS

Mark Rayson

Optimal Performance Ltd


Old Chambers, 93–94 West Street
Farnham, Surrey GU9 7EB
United Kingdom

A number of occupations remain physically demanding, requiring a high level


of physical capability for successful performance. Matching worker capabilities
with occupational requirements by selecting personnel who possess the
necessary physical attributes avoids irrational discrimination, reduces the risk
of injury and ensures operational effectiveness. This is the first in a series of
papers which describe the development and application of a systematic approach
to setting and validating occupation-related physical selection standards, using
the British Army as an example. This paper identified and quantified the most
physically-demanding tasks within each occupation. Complexities were
encountered during the data collection and analysis which are discussed.
Criterion tasks which represented common military activities were defined (a
single lift, carry, repetitive lift and loaded march), and all occupations in the
British Army were allocated a level of performance.

Introduction
Despite increased automation in the work-place, a number of occupations remain physically
demanding, requiring a high level of physical capability for successful performance.
Occupations in the Armed Services, Civilian Services (e.g. Police, Fire, Ambulance, Prison)
and heavy industry (e.g. mining, steel, construction) are prime examples where workers
require a minimum level of physical capability to be able to perform tasks required of their
occupation. Matching worker capabilities with occupational requirements by selecting
personnel who possess the necessary physical attributes avoids irrational discrimination,
reduces the risk of injury and ensures operational effectiveness.
Physical selection procedures have been in place for some time in certain occupations. But
where physical capability has been assessed, it is often through tests and standards which
have been derived pragmatically. For example, the British Army have minimum physical
performance requirements for recruits on tests of sit-ups, pull-ups to a bar and a 1.5 mile run.
However, despite the potentially serious consequences of recruiting to these occupations
personnel who are physically sub-standard, these entry criteria have not been assessed against
bona fide occupational requirements.
For occupations where performance is of paramount importance and where the lives and
safety of the public and other members of the work-force may be at risk, the basis for
physical selection requirements should be operational effectiveness. Operational
effectiveness is ultimately dependent upon the ability of each worker to perform the required
tasks to the required standards.
There are several approaches for developing occupation-related physical selection
standards. The preferred approach involves assessing workers on the performance of real
occupational tasks (ensuring a high content validity). However, this approach is often
impracticable, especially for job applicants, for reasons of safety, skill requirements or
logistics. Consequently, tests are often used in place of occupational tasks either as
substitutes (criterion validity approach) or as simulations of all or part of the tasks (construct
validity approach). Whichever approach is adopted, the tests must be predictive, quantitative,
reliable, safe, practicable and non-discriminatory.
This is the first in a series of three papers which describe the development and application
of a systematic approach to setting occupation-related physical selection standards, using the
British Army as an example. This paper describes Phase 1 which involved a job analysis to
identify the most physically-demanding aspects of each occupation. Subsequent papers


describe Phase 2 (setting physical selection standards) and Phase 3 (validation and
application of physical selection standards).

Method
A variety of techniques for gathering data about the physical demands of British Army
occupations were used including questionnaires, interviews, observation, and physiological,
biomechanical and psychophysical measurement techniques.
A job analysis questionnaire was administered to the Arms and Service Directorates: to
identify the most physically-demanding tasks required of personnel; to obtain a detailed
description of the task elements; and to cluster all occupations which shared similar task
demands. The questionnaires requested detailed information on the most physically-
demanding tasks that all personnel under their command would be required to perform under
peace-time conditions.
Where it was practicable to do so, the tasks identified from the questionnaires were
quantified in the field using multi-disciplinary techniques. The objective of the field
measurements was to provide quantitative data describing the requirements of the tasks for
each cluster of occupations.
One hundred and twenty trained male and female soldiers [mean age 22.2 (sd 3.1) years,
height 1776 (sd 69.3) mm, body mass 73.8 (sd 10.6) kg] performed the military tasks.
Subjects were ‘representative’ of their occupation, medically classified as ‘fully fit’, and
familiar with performing the tasks. The study was approved by the Army Personnel Research
Establishment’s Ethics Committee. Consent was provided by all subjects.
Performance of the tasks was recorded on video tape for documentation purposes and for
subsequent biomechanical analysis. The images from two frame-synchronized cameras were
recorded onto video tape recorders and a video time-code signal was ‘burnt’ onto the video
image. Digitisation was performed using a Peak Video Illustrator. Once digitised, selected
frames were replayed and measurements of distance and angle made using scaling rods as a
reference.
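
The sketch below illustrates the scaling principle described above: a rod of known length digitised in the image yields a metres-per-pixel factor, which then converts digitised point coordinates into distances and angles. It is an illustration of the principle only, not the Peak system's own procedure, and all coordinate values are invented.

```python
import math

def scale_factor(rod_end_a, rod_end_b, rod_length_m):
    """Metres per pixel, from a scaling rod of known length digitised in the image."""
    return rod_length_m / math.dist(rod_end_a, rod_end_b)

def real_distance(p1, p2, m_per_px):
    """Real-world distance between two digitised points."""
    return math.dist(p1, p2) * m_per_px

def angle_at(a, b, c):
    """Angle (degrees) at point b formed by the segments b-a and b-c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(cos_theta))

# Example: a 1 m rod spans 400 pixels; a lift height is then measured between
# two digitised points on the load, and a joint angle from three points.
m_per_px = scale_factor((100, 500), (100, 100), 1.0)
print(real_distance((320, 480), (320, 160), m_per_px))  # about 0.8 m
print(angle_at((0, 0), (0, 1), (1, 1)))                 # 90.0 degrees
```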
Where possible, direct measurements of the mass of equipment handled by personnel
were made using calibrated weighing scales or dynamometers. Dynamometers were also
used to measure the peak forces exerted by individuals on selected manoeuvres. Sizes of
objects, and horizontal and vertical distances of movements of subjects and objects, were
measured using tape measures.
Heart rate (HR) and rate of oxygen uptake (VO2) were measured to assess the demands of
some tasks on the cardio-respiratory systems. HR was measured using Sport Testers PE 3000
(Polar). HRmax was measured during a Multistage Fitness Test (Ramsbottom et al, 1988).
Oxygen uptake was measured using Oxylogs (P.K.Morgan).
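
For the heart rate data, the relative intensity of a task can be expressed against each subject's own maximum, as in the minimal sketch below; the subject identifiers and values are invented for illustration.

```python
def mean_percent_hr_max(task_hr_mean, hr_max):
    """Mean task heart rate as a percentage of each subject's own HRmax,
    averaged across subjects. Both inputs map subject id -> beats per minute;
    the values used below are invented for illustration."""
    percentages = [100.0 * task_hr_mean[s] / hr_max[s] for s in task_hr_mean]
    return sum(percentages) / len(percentages)

task_hr = {"s01": 142, "s02": 155, "s03": 131}
hrmax = {"s01": 196, "s02": 190, "s03": 188}
print(round(mean_percent_hr_max(task_hr, hrmax), 1))  # mean %HRmax for one task
```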

Results
Responses to the job analysis questionnaire were received from all Directorates. Eighty six
physically-demanding tasks, required of all personnel within an occupation, were identified.
All occupations which shared similar task demands were clustered. For example, the various
medical and nursing occupations were clustered, as their most physically-demanding task
was a common one—to evacuate a casualty on a stretcher.
Observations and measurements were subsequently made on 64 of the 86 tasks. The
detailed results have been published in an internal Ministry of Defence Report (Rayson et al,
1994). The main findings are summarised below.
The frequency of occurrence of the principal actions used during the tasks is shown in
Table 1. Fifty-five percent of tasks involved a combination of actions, with lifting and
carrying comprising the majority of these (89%).
Table 1. Frequency of principal actions used during tasks

The vertical travel distance of the lifts ranged from ground level to overhead. The start and
end heights of the lifts are shown in Table 2.

Table 2. Start and end heights of lift tasks

The horizontal distances of the carry tasks ranged from 2 to 500m. The distribution of
distances is shown in Table 3.
Table 3. Horizontal distances of carry tasks

The tasks were performed by teams of between one and eight people. Thirty seven percent
of tasks were single-person and 63% were multi-person.
Where objects were handled, loads ranged from 10 to 111 kg per person. The distribution of
loads is shown in Table 4. Where loads and forces were shared by more than one person, a simplistic
approach was adopted whereby the total load was divided by the number of people in the team.
Table 4. Loads handled

Mean % HRmax values ranged from 55% to 88%. The frequency with which the
categories of % HRmax were achieved is shown in Table 5.
Table 5. Percentage of maximum heart rate achieved during tasks

Mean values of oxygen uptake ranged from 1.16 to 2.92 l.min–1. The frequency with which
the categories of VO2 were achieved is shown in Table 6.
Table 6. Oxygen uptake during tasks

Discussion
The decision to ask Arms and Service Directorates to shortcut the job analysis process, by
identifying and defining the most physically-demanding tasks and by specifying minimum
standards of performance, was partially successful. Although key tasks were identified, the quantity
and quality of the responses were variable. As reported in a job analysis of the United States’ Army
(Sharp et al., 1980), the responses represented experienced opinion rather than observed practice.
For some tasks the difficulty in providing a detailed description was understandable. For
example, the Infantry task of “assaulting a prepared enemy position” involved a sequence of
sub-tasks, the precise details of which are dictated by the mission. Some of these sub-tasks
included: an approach march over variable terrain and for a variable distance; ‘closing in’ on
the enemy position whilst alternating sequences of short sprints, crawling and shooting; a
final assault to secure the position; and evacuation of casualties. For other tasks, the
responses were either unnecessarily complicated by respondents, or subsequent observation
and measurement in the field revealed the tasks to be unrealistic.
The importance of obtaining a detailed and accurate description of the job requirements as
the first step in establishing physical selection standards cannot be over-stated. The
components of a task are so intrinsically linked to performance that they directly determine
performance outcome. For example, handling awkwardly shaped, larger, and asymmetrical
objects, or objects which do not afford good grip, or increasing the vertical height, spine-to-
load distance, and frequency of lifting, all decrease lifting capability (Ayoub and Mital,
1989). In view of the fundamental inadequacy in the definition of the tasks, there appeared
little chance that the majority could be used as reliable criterion tasks.
One of the most important findings from the job analysis was the predominance of
material handling activities. With only a few exceptions, the demonstrated tasks involved
lifting, lowering, carrying, pushing or pulling items of equipment. This was not surprising
given the frequency with which material handling activities had been reported by both the
United States’ and Canadian Armed Forces (Sharp et al., 1980; Allen et al., 1984).
Although some extremely heavy lifts were recorded (e.g. 90 kg fork lift, 100 kg generator
lift, 111 kg T-bar lift), the vertical distance of these lifts was often small and the start and
finish heights were largely in the optimal lift range (i.e. ground to waist height). The heaviest
lifts to head height or higher were recorded during bridge-building and lifting generators on
to vehicles: both involved 44 kg lifts per person.
Where tasks involved handling very heavy loads, strategies were adopted by the soldiers
to minimise the stress on the lumbar spine. For example, the 111 kg T-bar was lifted with the
legs straddling the object and the hands positioned between the legs. The 100 kg generator
was raised by extending the legs whilst supporting the generator handle in the crook of the
arms. Involvement of the arm muscles was thereby minimised.
Two thirds of the tasks were multi-person, i.e. they required involvement by more than
one soldier. Some of these tasks involved simultaneous sharing of a given manoeuvre, such as
the numerous two- and four-person lifts, whilst others involved a chain of personnel
consecutively contributing to the manoeuvre, e.g. loading barmines. Both forms of multi-
person activities complicated the process of identifying and analysing an individual’s
contribution in successfully completing the task.
A few studies on multi-person tasks have been published. Kroemer (1974) suggested that
in the case of shared pulling or pushing, the force recommendations for one person should be
multiplied by the respective factor (e.g. two or three) using the assumption that the load is
shared equally. For simplicity, this was the principle adopted in calculating the loads per
soldier cited in this paper. However, this method is likely to underestimate the actual strength
requirements on soldiers (Davies, 1972; Karwowski and Mital 1986; Sharp et al. 1993).
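
The equal-sharing principle adopted here amounts to the simple calculation sketched below; as noted, it is likely to underestimate the true demand on each team member. The load and team size in the example are hypothetical.

```python
def load_per_person(total_load_kg, team_size):
    """Equal-sharing assumption used for the per-soldier loads in this paper.
    As noted in the text, this is likely to underestimate the actual strength
    requirement on each team member."""
    return total_load_kg / team_size

print(load_per_person(120.0, 3))  # hypothetical 120 kg load, three-person team -> 40.0 kg
```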
A number of examples of large and variable-shaped objects were measured which
included various missiles, generators and scanners, camouflage nets etc., which compelled
unusual methods of handling. Other objects were either asymmetrical in load distribution
(generators, missiles, drawbars etc.) or had unstable loads (camouflage nets, fuel cans, food
pots etc.) thereby reducing performance (Ayoub and Mital, 1989). Similarly, although the
majority of material handling tasks afforded good coupling between the worker and the
object by the provision of handles, a number of tasks did not.
Tasks which were judged by the author to involve a significant cardiovascular component were
assessed for HR and VO2 response. However, the intricate relationship between HR and VO2 and the
mode (Petrofsky and Lind, 1978; Rayson et al., 1995), intensity and duration (Ayoub and Mital,
1989) of the tasks, combined with the inadequacy in definition of the task components allowed a
very limited interpretation of the cardiovascular data. Future measurement of these variables during
a job analysis is not recommended unless the tasks can be adequately defined.
Many of the measured tasks were skilled, multi-person activities involving complex
manoeuvres, usually with equipment, and were often performed in restricted spaces, forcing
unusual postures. The inability to define the tasks precisely meant that the majority were
performed and measured self-paced.
Many of the complicating factors described confounded attempts to standardise the tasks
during their demonstration and quantification and conspired against their adoption as criterion
tasks. However, close scrutiny of the tasks revealed considerable duplication and overlap. The
majority of identified short-comings could be overcome and the project progressed by using the
data collected during the job analysis, combined with subject-matter expert opinion, to define
generic military tasks for use as criterion tasks. These generic criterion tasks would remain
task-related, as was originally intended, but would not attempt to encompass every task that had
been identified from every occupation. Rather the generic tasks would be typical and
representative of a cluster of similar activities. The standards could vary by occupation.
The experience in other nations supported the need to develop generic criterion tasks. The
United States Army progressively reduced the complexity of their task classification system
to encompass eventually only lifting tasks which were reduced to five load categories
(Department of the Army, 1982). The Canadian Armed Forces concentrated on four common
military tasks comprising a casualty evacuation (stretcher carry), an ammunition box lift, a
maximal effort dig, and a loaded march (Stevenson et al, 1988). No attempt was made to set
standards which were occupation-specific. Rather, minimum acceptable standards were set
which were common for all personnel.
The strength of adopting generic criterion tasks lay in its simplicity and manageability. A
relatively small number of generic tasks would need to be identified, protocols developed and
minimum acceptable standards of performance agreed. The weakness lay in the shift away
from the real occupational requirements to the notion of generic or representative tasks.

However, if the tasks were rationalised empirically and were deliberated and refined by a
team of subject-matter experts, it could be argued, if a legal challenge arose, that all
reasonable action had been taken given the time course and resources available.
Consequently, four generic criterion tasks were developed to represent the key activities identified
in the job analysis. They comprised a Single Lift, Carry, Repetitive Lift and Loaded March. The
feasibility and logistics of administering the generic criterion tasks as ‘the gold standards’ against
which any future tests could be validated influenced the selection and design of the tasks. The
diversity of physical requirements in the different occupations was encompassed by setting three
standards, referred to as Levels 1, 2 and 3, for each of the four generic criterion tasks.
Defining the standards and allocating personnel to levels were achieved on the basis of
both the empirical data from the job analysis and subject-matter expert opinion. Where
objective data existed to set the standards confidently and allocate occupations to a particular
level, this method prevailed. But where no empirical data existed, or where the empirical data
fell between Levels, subject-matter expert opinion was sought. Involving subject-matter
experts lent greater face validity to the selected tasks and demonstrated the acceptability of
the set standards within the organisation.
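As a minimal sketch of this allocation logic (the Python code below, with its thresholds, margin and example values, is purely hypothetical and is not drawn from the study):

# Hypothetical sketch of allocating an occupation to Level 1, 2 or 3 for one
# criterion task: clear empirical data decides; missing or borderline data is
# referred to subject-matter experts. All figures are invented for illustration.
LEVEL_UPPER_BOUND_KG = {1: 20.0, 2: 35.0, 3: 50.0}   # assumed single-lift demand boundaries
BORDER_KG = 2.5                                       # assumed "falls between Levels" band

def allocate_level(measured_demand_kg=None, expert_opinion=None):
    if measured_demand_kg is None:                    # no empirical data: defer to experts
        return expert_opinion
    for level in (1, 2, 3):
        upper = LEVEL_UPPER_BOUND_KG[level]
        if measured_demand_kg <= upper - BORDER_KG:   # clearly within this level
            return level
        if measured_demand_kg <= upper + BORDER_KG:   # falls between Levels: defer to experts
            return expert_opinion if expert_opinion is not None else level
    return 3                                          # demands beyond the top boundary sit at Level 3

print(allocate_level(15.0))        # clear data -> Level 1
print(allocate_level(34.0, 3))     # borderline -> expert opinion (Level 3)
print(allocate_level(None, 2))     # no data    -> expert opinion (Level 2)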

Acknowledgement
This work was commissioned by the Ministry of Defence from the Army Personnel Research
Establishment, Farnborough, Hampshire, UK. The author wishes to acknowledge the
contributions to this work by DG Bell, DE Holliman, RV Nevola, M Llewellyn, A Cole, RL
Bell, WR Withey, and MA Stroud.

References
Allen, C.L., Nottrodt, J.W., Celentano, E.J., Hart, L.E.M. & Cox, K.M. (1984).
Development of occupational physical selection standards (OPSS) for the Canadian
Forces—summary report. Technical Report 84-R-57. DCIEM, North York, Canada.
Ayoub, M.M. & Mital, A. (1989). Manual materials handling. London: Taylor and Francis.
Davies, B.T. (1972). Moving loads manually. Applied Ergonomics, 3, 190–194.
Department of the Army (1982). Women in the Army. Policy Review. Washington DC
20310. USA.
European Economic Community (1976). Council Directive of 9 February 1976 on the
implementation of equal treatment for men and women… Official Journal of the
European Communities, 14 Feb 1976, 1, 39–42.
Karwowski, W. & Mital, A. (1986). Isometric and isokinetic testing of lifting strength of
males in teamwork. Ergonomics, 29, 7, 869–878.
Kroemer, K.H. (1974). Horizontal push and pull forces. Applied Ergonomics, 5, 2, 94–102.
Nottrodt, J.W. & Celentano, E.J. (1987). Development of predictive selection and placement
tests for personnel evaluation. Applied Ergonomics, 18, 4, 279–288.
Petrofsky, J.S. & Lind, A.R. (1978). Comparison of metabolic and ventilatory responses of
men to various lifting tasks and bicycle ergometry. Journal of Applied Physiology:
Respiratory, Environmental and Exercise Physiology, 45, 64–68.
Ramsbottom, R., Brewer, J. & Williams, C. (1988). A Progressive Shuttle Run test to
estimate maximal oxygen uptake. British Journal of Sports Medicine, 22, 141–144.
Rayson, M.P., Bell, D.G., Holliman, D.E., Llewelyn, M., Nevola, V.R. & Bell, R.L. (1994).
Physical selection standards for the British Army. Phases 1 and 2. Technical Report
94R036, Army Personnel Research Establishment, Farnborough, UK.
Rayson, M.P., Davies, A., Bell, D.G. & Rhodes-James, E.S. (1995). Heart rate and oxygen
uptake relationship: a comparison of loaded marching and running in women.
European Journal of Applied Physiology, 71, 405–408.
Sharp, O.S., Wright, J.E., Vogel, J.A., Patton, J.F., Daniel, W.L., Knapik, J. & Korval, D.M.
(1980). Screening for physical capacity in the US Army. Technical Report T8/80, US
Army Research Institute of Environmental Medicine, Natick, USA.
Sharp, M.A., Rice, V., Nindl, B. & Williamson, T. (1993). Maximum lifting capacity in
single and mixed gender three-person teams. In Proceedings of the Human Factors
and Ergonomics Society 37th Annual Meeting.
Statement of Defence Estimates (1990). Hansard, 1, 66:748, 6 February 1990.
Stevenson, J.M., Andrew, G.M., Bryant, J.T. & Thomson, J.M. (1988). Development of
minimum physical fitness standards for the Canadian Armed Forces. Phase III.
Ergonomics Research Laboratory, Queen’s University at Kingston, Canada.
THE HUMAN FACTOR IN APPLIED WARFARE

A E Birkbeck

Ballistics and Impact Group


Mechanical Engineering Department
University of Glasgow G12 8QQ

Warfare over the centuries has changed radically in some aspects. This has
been brought about by improved technology, better understanding of materials
and improvements in engineering. However, there is one limiting factor that
has not changed throughout the ages. He has been called many names:
foot slogger, grunt, the PBI. It does not matter what you call him; he is the
infantry soldier. He has always been the main limiting factor in warfare.

Introduction
Before the advent of motorised, horse drawn or other troop transport, the only way an army
could move about the country was by covering the ground on foot. This practice carried on until
the 1860s and the American Civil War where for the first time railroads were used to transport
whole armies. In this day and age of advanced mechanisation it may seem superfluous that
the soldier should still be physically trained in order to improve his stamina and endurance but,
during the Falklands war in 1982 and due to the lack of transport, the infantry had to resort to
the traditional method of covering long distances: carrying everything and walking!

Marching and load carrying

Marching
During the Roman times a legionnaire travelling between countries had to carry everything he
possessed: his armour, his weapons and his food. In all, the total load was about 30kg (it is
not clear if this figure includes the weight of the three stakes each legionary carried for the
purpose of building a palisade when they camped each night). On campaign they were
expected to march between 40 and 48km/day and called themselves Marius’s mules because
of the weight they had to carry (Watson, 1983). The soldier of today carries between 25 and
30kg and, when not being ferried in an armoured support vehicle, he can march 35 to 45km/
day. A point that should be taken into account is that the Roman mile is 0.92 of the statute mile
and, given the time allowed to march the distances recorded, the pace works out very close to
the British Army’s rate-of-march of 3 miles/hour with a ten minute halt included (British
Army Drill Manual). There have been times when loads have been in excess of those
mentioned but these have been exceptional circumstances e.g. the Normandy landings
(Wilmot, 1952). Likewise the parachute landings at Arnhem but the distances on these
occasions were comparatively short and not for day after day of constant marching.

In the Middle Ages there was a regression in the ability of an army to cover the distances that
the Roman armies could and did travel. One of the main reasons for this was the make up of
the army. With the downfall of the Roman Empire and its armies and with the slide into
feudalism, the king no longer kept a large regular standing army. At the prospect of a war
each of the nobles who supported him was expected to bring with him a number of his local
people, notably archers and spearmen. While the archers were expected to train in the use of
the bow (Hardy, 1992) they were not trained in other aspects of military discipline (Lloyd,
1908) such as marching, load carrying, or skill with the sword, which was considered a
gentleman’s weapon. The army itself moved, not in an organised column of march
but rather in the manner of a football crowd, travelling at the pace of the slowest member
and only assuming any sort of formation as they drew close to the enemy. The distance
marched each day was 19 to 24km (Hardy, 1992) and was controlled by the terrain they had
to cross with the baggage wagons. As one can imagine the weather played a large part in this
as well.

Load carrying
In a medieval army the only people with any sort of weight to worry about were the knights
with armour that weighed about 30kg. On the march this would not be worn but was carried
on the accompanying wagons. The infantryman, apart from his helmet and possibly a leather
jerkin, only had his bow or spear and the clothes he stood up in. In some cases individuals
may have had a sword or knife looted from a previous battle and, in accounts of the battle of
Agincourt, each bowman and spearman also carried a 6ft stake (Bradbury, 1985). From this we
can assume that the weight carried by the soldier was not excessive.

In modern times the need for the infantry soldier to march any great distance has been
reduced by the availability of motorised transport but there are times when this is not
possible. These occasions do not often occur but the one great advantage the modern infantry
soldier enjoys over his predecessors is his personal load carrying equipment. The legionnaire
carried his equipment with the aid of a T or a Y shaped pole (Upcott and Reynolds, 1955;
Lloyd, 1908) balanced over his shoulder in the manner of a bricklayer using a hod. The
medieval soldier carried his possessions wrapped in his blanket and slung over his shoulder, a
practice that was carried on during the American Civil War in the early 1860s. The modern
infantry soldier’s equipment is worn basically on his belt with the load being supported by
padded straps over his shoulders. The style of the equipment has changed and improved since
the start of this century but the basic application remains and, for all the improvements in
equipment and materials, the modern soldier still cannot carry a greater load and
function any more successfully than could the Roman legionnaire.

Arms and Armour

Arms
Another aspect with which a soldier has to become familiar is “skill at arms”, whether in
modern times with a rifle or in medieval times with a bow. In the Middle Ages the longbow ruled
the battlefield because of its range and ability to strike at the enemy and prevent him closing
on a friendly position. The archer in peacetime was subjected to a rigid training regime. He
practised as a youngster with a light-draw bow. As he got older a heavier draw bow and
heavier arrows increased his strength and marksmanship (Hardy, 1992) so that in time of war
he would function in a cohesive body. In battle the archers would open fire at a maximum range
of “330 yds”, i.e. 300m, using volleys of arrows, at a rate of up to 17 arrows a minute. At this
rate of fire they had 5 or 6 arrows in the air at any one time (Strickland, 1997). As the range
decreased to 200 yds the best shots would start to select individual targets and their rate of
fire would drop to between 8 and 10 arrows a minute.

An infantryman before World War I was trained to use his rifle at extreme ranges varying from
800 to 1000 yds but during the First World War experience showed that the ability to shoot at
1000 yds was superfluous in modern warfare. From the latter part of the war up to today,
most military rifles have been designed with sights that do not go beyond 500m. Today’s soldiers,
armed with an assault rifle, are not expected to engage the enemy beyond 400m and the effective
battle range is 300 m (British Army Individual Skill at Arms Manual, 1975). The rate of applied
fire is 8 to 10 aimed shots a minute, a range and rate of fire that would be familiar to the
medieval bowman as the maximum range of his arrow and his aiming capability.

One noticeable difference between the bowman and the rifleman is their weapons. The bow
weighs 1.5kg while the rifle weighs 4.5kg. The bowman has to be strong enough to be able to
draw a bow with a draw weight between 85lbs and 150lbs or greater (Hardy, 1992), and hold
it steady long enough to aim and loose an arrow. The rifleman has to be able to hold his
heavier rifle and carry out the same drill as the archer before firing. The limiting factor
appears to be the same for the bow and the rifle: each user needs the strength to hold the weapon
steady, the ability to acquire a target, a steady aim and a controlled firing technique. It is
this sequence of operations that restricts the number of targets and the range at which they
can be seen and engaged by the individual.

Hand-thrown weapons such as the Roman pilum weighed 3kg and had an effective range of
about 30m (Upcott and Reynolds 1955). While a modern hand grenade weighs about a third
of this, it is still thrown about the same distance, not three times further. The weight of the object does
not appear to matter very much as it comes down to the individual person. Modern javelins
are thrown over 100m but they weigh about 1kg and are launched by running and then
throwing whereas the pilum and the grenade were thrown from a standing position without
the benefit of a run.

Armour
Armour has been worn throughout the ages. During the Greek wars before 490 BC the armour
weighed about 40kg and required a slave to carry it between battles (Lloyd 1908). After 490
BC, with the change to a more mobile style of warfare, the armour evolved and became lighter.
This is shown in the changes to helmets, from the earlier heavy pot helmet to the Corinthian
style. These used a thinner wall section of the same material and a better overall design and
ended up weighing between 0.9 and 1.5kg (Blyth, 1995). In contrast, the modern battle helmet,
constructed of a polymer composite material, weighs between 1.2 and 1.5kg (Courtaulds, 1997).

During the early Roman period (200 BC to AD 40) the body armour was of a chain-mail
construction and weighed 12 to 15kg. After AD 40 the style changed to plate armour in the
form of the Lorica Segmentata, which weighed about 9kg. Today’s modern multi-layered body
armour weighs 8.6kg, very much in line with that of the late Roman armour. In the Middle
Ages the armour worn by the knights covered the whole body but even that became lighter as
it became more common to fight on foot as opposed to being on horseback. This was mainly
because of the threat of the longbow. The common style of armour used during this period
weighed around 30kg and the only other item the knight had to carry was his chosen weapon.
Again it appears that 30kg is the weight a fit man can carry on his person and still function
normally, whether legionary or modern infantryman.

Diet
Carrying his food was a new concept introduced at the time of Julius Caesar. All the armies
from earlier times, such as Alexander the Great’s, had to forage for their food and rely on the
surrounding countryside to sustain them. This method of feeding the army has only one
advantage and several disadvantages. The one advantage is that there are no supply lines and
no baggage train. The disadvantages are that the diet cannot be controlled and they have to eat
what is available. If one is in the same area for some time, foraging will use up the local
resources and the army then goes hungry. Also one is at the enemy’s mercy if they decide to
carry out a scorched earth policy and deprive the army of all possible food supplies. The
advantage gained by carrying a food supply is that it allows an army to be independent of the
immediate environment.

The Roman soldier lived as part of an 8-man squad that organised and carried its own camping
and cooking equipment. Evidence on display in the Hunterian Museum, University of Glasgow,
points to a surprisingly varied diet which included pasta, meat, fish, fruit and vegetables. He
also had beer and wine to drink. This was supplied to each squad along with a ration of grain
from the stores, which was ground into flour and baked into bread and rolls. From this it can be
deduced that the Roman soldier had a reasonably healthy diet which would reflect in his general
health and strength, an advantage when it came to training exercises. Physical training sessions
were included in the basic training and the soldiers were encouraged to take part in sports such
as running, jumping, swimming and carrying heavy packs in order to increase their stamina
during route marches and weapons training (Watson, 1983).

In contrast, one has only to read through the numerous books about warfare in the Middle
Ages to realise the desperate living and eating conditions. As noted earlier the armies then not
only moved like a football crowd, they also ate like a plague of locusts, taking everything
from the surrounding area. Even in Napoleonic times the idea of the army supplying
rations to its soldiers was only partially realised and the troops had to forage to supplement the rations.
During the 19th century, the Commissary Department started supplying the army with food
although in the 1850s, during the Crimean War, the system broke down and the army was
reduced to living in conditions like those of the Middle Ages. This war shows the importance
of a regular supply of food being available to the troops. Rationing during World War II was
even extended to the civilian population in the UK and the diet, while not very exciting, was
designed to give everyone the correct balance for a healthy life. Today’s soldiers fare much
better than those of the past, having access to a wide variety of food to sustain them and allow
them to be able to carry out the strenuous tasks they are asked to perform.

Conclusion
It is apparent that, irrespective of the changes in the technology of warfare over the past two
millennia, the dominant constraint has been the infantryman. He cannot function properly if he
is expected to march further than 35 to 45km/day or carry a load or wear armour greater than
30kg in weight. Full armour has not been used for several centuries but helmets are still in
use. History has shown that helmets heavier than 1.6kg are not practical, as the neck cannot
support greater loads during vigorous physical exercise. The effective distance of hand-
thrown weapons, irrespective of their weight, is still much the same, implying that the human
arm is the limiting factor. Likewise, for weapons that require to be aimed, 300m appears to be
the optimum range at which a target can be seen clearly, acquired and hit. Finally, over the years,
the practice of supplying the soldier with regular food has helped him to perform at greater
efficiency. To this day nobody has invented anything better than the well-equipped, well-
trained and well-fed Infantry Soldier.

References
Blyth P.H., 1995, Proc Light Armour Systems Symposium, RMCS Shrivenham.
Bradbury, J, 1985, The Medieval Archer, pp 130–131.
British Army Drill Manual, appendix “C”, Time and Pace pp 157.
British Army Individual Skill at Arms manual, 1975.
Wilmot, C., 1952, The Struggle for Europe, pp 223 and 239.
Hardy, R., 1992, Longbow: A Social and Military History, 3rd Ed. pp 102, 212–216, 218,
J.H.Haynes & Co Ltd.
Lloyd, Col (retd) E.M., 1908, A Review of The History of Infantry, pp 3, 37, 75, Longmans,
Green, and Co.
Strickland, M., Personal communication, Dept of Medieval History, Glasgow Univ.
Upcott, Rev A.W., and A.Reynolds, Caesar’s Invasion of Britain, pp 17, 18, translated 1905,
15th Ed, 1955.
Watson, G.R., 1983, The Roman Soldier, 2nd Ed, pp 54–55, 62, 65–66, Thames and Hudson,
Pitman Press, Bath.

Acknowledgement: Thanks are due to Mr B McCartney of Glasgow, for access to his private
library.
AIR TRAFFIC MANAGEMENT
Getting The Picture: Investigating The Mental Picture Of
The Air Traffic Controller

Barry Kirwan, Laura Donohoe, Toby Atkinson, Heather
MacKendrick, Tab Lamoureux, & Abigail Phillips

Human Factors Unit, ATMDC, NATS, Bournemouth Airport, Christchurch,
Dorset, BH23 6DF

Air traffic controllers have a mental representation of what is happening on
the radar screen in front of them, including what has happened, what is
probably going to happen, what could happen, and what they would like to
happen and are in fact trying to achieve. This representation, whether
primarily pictorial in nature or verbal or both, is generally referred to as ‘the
picture’. Controllers talk of ‘having the picture’, as a necessary prerequisite
for controlling air traffic, and also talk about ‘losing the picture’ as a rare
event in which their abilities to control traffic break down. Future
automation may impact upon this picture, and so it is useful to gain an
understanding of the picture, what it is and how it works, and what affects it.
This paper reports the initial results of a series of interviews with controllers,
and insights from a recent experiment, which shed some light on the
complexities and potential variations of this mental representation.

Background
Air traffic controllers control air traffic primarily based on a radar display and flight strip
information, the former showing aircraft position and flight level, and the latter giving an
indication of where the aircraft originated, their destination, and by what route. Air traffic
moves within a three dimensional space as a function of (the fourth dimension) time. The
controllers therefore have a two-dimensional ‘picture’ in front of them (via the strips and the
radar picture which updates in real time), but they must also be predicting where the aircraft
are going to be in the near and medium future, and be aware of other possibilities, such as
unplanned deviations in heading, speed or (more rarely) altitude. They must
therefore have a mental ‘picture’ which operates in four dimensions. This is necessary in
order to detect and avoid any potential conflicts between aircraft, and to facilitate the smooth,
orderly and efficient (called ‘expeditious’) movement of air traffic. In order to do this,
controllers generally agree they need to have an internal or mental ‘picture’, which is based
on the actual radar picture, strip information, and communications, etc.
As air traffic density continues to increase in the near future, it is highly likely that some
form of computer assistance or automation will need to be implemented to enable the
controllers to handle the increased traffic load (and their own workload) safely and
efficiently. For example, it is planned to replace paper strips either with electronic versions
or with object-oriented displays that attach the information to the labels on the radar display
which show the position of the aircraft. Communications techniques and equipment are also
likely to change, requiring less oral radio-telephony between controller and pilot, and
instead relying more on computerised messages which will be ‘up-linked’ and ‘down-
linked’ between air traffic controllers on the ground and the aircrew in the cockpit.
Additionally, several new tools are in development to supplement controller functions, such
as conflict prediction and resolution, and the sequencing and spacing of aircraft on final
approach to an airport runway. The question is, what impact will such tools or supportive
semi-automation have on the controller’s performance, and in particular on the controller’s
‘picture’?
In order to understand the picture, the first phase of research has proceeded in two related
strands. The first strand involved carrying out a series of interviews with approximately
twenty operationally valid air traffic controllers on the nature of the picture. The second
strand of the research has focused on a detailed investigation of two controllers’ situation
awareness and eye movements in a series of real-time simulations. This required the use of
situation awareness debriefs (a modified SAGAT technique—Endsley and Kiris, 1995) and a
head-mounted eye tracker, together with a limited amount of auto-confrontation (where
controllers review their own performance in handling traffic—they are able to do this via a
video replay of their eye track superimposed on the recorded scene).

Results

Interviews on the picture


Table 1 shows the main interview questions asked of the controllers, and Table 2 shows some
of the types of answers gained from the interviews. Clearly there is a diversity of ideas on
what is meant by the picture; indeed, one respondent stated that there was no picture and that
having the picture was a euphemism for being confident and skilled enough to handle traffic
fluently and safely. From a safety perspective what was most interesting were the insights
gained into what can make maintaining the picture difficult. This can lead to ‘losing the
picture’, a catastrophic breakdown of the ability to control traffic. Controllers who had had
this unnerving experience stated that they recognised this was about to happen (by getting
behind on tasks and becoming purely reactive rather than proactive), and called in support
either to help them to regain the picture, and their confidence, or to take over. This extra
person effectively adds cognitive ‘processing power’ to the task.
Given the experience the aviation industry has had with the introduction of cockpit
automation and its effects on pilots (e.g. Billings, 1997: creating many secondary automation
management tasks, and increasing workload just at the point where something unusual
happens on the flight deck, etc.), considerations based on insights such as those in Table 2
(Q10) may help forestall similar problems occurring for ATC as future automation and
consequent interface, procedure and role changes are introduced.
The interviews undertaken during these studies represent a first set of data, and it is
intended to continue conducting these ‘picture’ interviews, since new information is still
being generated from the more recent interviews. It is also desirable to extend the range of
ATC jobs/positions being analysed, to gain a full appreciation of different picture types and
aspects. One interesting ATC position to analyse would be oceanic control, since it is
procedurally controlled at present, without a radar picture, though this may change in the
future as new ATC technology arrives.

Table 1—Sample interview questions on ‘the picture’

Experimental investigation of the picture


Given the important caveat that only two controllers were the subject of this investigation
(each with 7 years of operational experience), the investigation nevertheless produced some
interesting results. The main result, gained from reviewing the eye tracking data, the situation
awareness scores, and the result of the auto-confrontations, was that these two controllers
appeared to have entirely different notions of what constituted the ‘picture’, although their
performance on the same traffic samples was similar and met fully the requirements of
the job.
The first controller had a primarily visual picture, based on the radar display. This
controller maintained the picture with regular circular scanning of the radar display including
its periphery, and by frequent contact with the aircraft, and by monitoring and updating the
strips. This controller’s picture was good globally, i.e. more was remembered by this
controller during the situation awareness debriefs immediately after each exercise. At one
point during the study, the simulation was frozen after this controller had spent some time
monitoring and re-ordering the strips, temporarily ignoring the radar display. The situation
awareness measure showed that the controller did in fact have all the required information
about the aircraft, but the locations were all similarly inaccurate, corresponding to the
controller’s last visual sweep of the radar display. This suggested a strong visual and topographical
picture for this controller.
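A toy illustration of the kind of comparison this involves (the callsigns, positions and the scoring itself are invented for the example and do not reproduce the SAGAT-based measure used in the study): recalled positions can be scored against the traffic at the freeze and, separately, against the traffic at the controller's last radar scan.

import math

def mean_position_error(recalled, truth):
    """Mean Euclidean error (nm) over aircraft appearing in both dictionaries."""
    common = recalled.keys() & truth.keys()
    return sum(math.dist(recalled[c], truth[c]) for c in common) / len(common)

# Invented (x, y) positions in nautical miles
recalled     = {"ABC123": (10.0, 4.0), "DEF456": (22.0, 15.0)}
at_freeze    = {"ABC123": (13.0, 5.0), "DEF456": (25.5, 16.0)}
at_last_scan = {"ABC123": (10.5, 4.2), "DEF456": (22.4, 15.1)}

print(round(mean_position_error(recalled, at_freeze), 2))     # larger error against the freeze
print(round(mean_position_error(recalled, at_last_scan), 2))  # smaller error against the last scan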

Table 2—Typical answers to a subset of the questions

The other controller had less global awareness. This controller’s awareness was on a more
local level, and so the situation awareness debriefs showed less information on locations and
details of aircraft, except those this controller had been dealing with at the time (for these
aircraft the SA was good). This controller appeared to have a more verbal and less visual/
spatial ‘picture’, in the sense of having a list of priorities of aircraft to deal with in sequence.
Once these aircraft had been dealt with, it appeared that information on them was released
from working memory. This processing and discarding of information has been noted before
in a study of expert versus novice controllers (the latter appeared to try to remember
everything, but the experts’ performance was better, as they only remembered the essential
information—Stein, 1993).
What is interesting is that performance was equal (in terms of parameters such as conflict
avoidance, and quality of service to aircraft) between the two controllers, though one
controller had a far better global situation awareness than the other. However, it should be
noted that, due to the unpredictability of conflict situations (where two aircraft can
potentially lose standard required separation, and therefore have a risk of collision), the
simulation freeze and SA debrief rarely coincided with such an incident, which is where a
performance difference (in terms of conflict reduction) between the two picture ‘styles’
might manifest itself. Taken together with the eye tracking data and the auto-confrontation,
however, this pilot experiment suggested evidence for at least two different picture types,
scanning strategies, and processing styles.

Discussion
There are potential implications in the results presented for future ATC systems. Firstly, the
range of picture types needs to be fully understood by ATM system designers. Secondly,
certain picture types may favour certain future ATC display and support-tool concepts more
than others. Thirdly, there is the question of which picture type is best (safest; most
expeditious; maximising situation awareness and optimising workload), if indeed there is a
‘best picture’, given projected future traffic levels. Fourth, what parts of the picture should be
left to the controller, and which parts, if any, should be supported or even automated? As traffic
increases, is it tenable that the controller will be able to maintain the picture, or will there be
more reliance on automation, or will the controller have a very different picture in the future?
Fifth, how will the role of the controller change, e.g. will the controller’s job become more
supervisory in nature, and does such a role change necessitate giving up the picture, and will
the controller still be able to intervene effectively in such a role? Sixth, and more
fundamentally, how is the picture first evolved during training, and do individuals have a
predilection for certain picture types, or can any controller learn to have a particular type or
style of picture? It is hoped that the future planned research can at least begin to answer, or
give insights into, some of these questions, in a practical ATC context.

Acknowledgements: The authors would like to thank all the controllers that participated in
both studies.

Disclaimer: The opinions expressed in this paper are those of the authors and do not
necessarily represent any policy or intent on behalf of the parent organisation.

References
Billings, C.E. (1997) Aviation Automation: The Search for a Human-Centred Approach,
(Mahwah, NJ: Lawrence Erlbaum Associates).
Endsley, M.R. and Kiris, E.O. (1995) Situation awareness global assessment technique
(SAGAT) TRACON air traffic control version user guide, (Lubbock, TX: Texas Tech
University).
Stein, E.S. (1993) Tracking the visual scan of air traffic controllers. Proceedings of the 7th
International Symposium of Aviation Psychology, Vol 2, 812–816.
Developing a predictive model of controller workload
in Air Traffic Management

Andrew Kilner; Michael Hook; Paul Fearnside; Paul Nicholson.

Human Factors Unit, Air Traffic Management Development Centre,
Bournemouth Airport, Christchurch,
Dorset, BH23 6DF, UK.

Workload has long been used as a metric to indicate system performance and
operator strain (Moray 1979). The National Air Traffic Services (NATS) Air
Traffic Management Development Centre (ATMDC) uses a unique
methodology and toolset based on Wickens’ Model of Multiple Resources
(Wickens 1992) to analyse and predict workload for a given set of
circumstances. PUMA (Performance and Usability Modelling in ATM (Air
Traffic Management); Hook 1993) uses a task-analytic approach, inferring
workload from observational task analysis, cognitive debriefs
(auto-confrontation) and interviews with subject-matter experts. A large
and detailed model of a concept of air traffic operation is
developed and is subsequently used to predict workload. The following
paper describes the PUMA workload analysis process and reviews several
projects in which PUMA has had an impact in terms of Human Machine
Interface (HMI) design, and workload assessment.

Introduction
In order to conduct research and development work in ATM at an early stage in the design
life-cycle, the ATMDC uses the PUMA toolset as one of its methodologies. PUMA is used in
part to evaluate prototype tools for the Air Traffic Controller (ATCO) which enable more
aircraft to be handled. The PUMA toolset enables the determination of controller workload
given a particular way of working (known as an operational concept or OC), and a particular
airspace sectorisation, route structure and traffic sample (known as a scenario). The analyst
can thus carry out initial investigations to determine those areas of an operational concept
which are particularly workload intensive. The effect on workload of proposed changes to the
operational concept (e.g. task restructuring/interface reconfiguring) may also be examined
i.e. the PUMA toolset offers the ability to predict workload. This approach is possible
because PUMA calculates an estimate of the overall workload placed upon the controller by
each of the tasks and actions undertaken by that controller (tasks and actions are the basic
levels of controller interactions with the ATM system). By varying the sequence of tasks and
actions, the operational concept can be varied to investigate alternative methods of working.

The PUMA Toolset and Methodology


The PUMA toolset consists of a number of window-based graphical editors. These allow the
user to define both the operational concept and the scenario, and hence, to generate estimates
of controller workload. A variety of editors allow the user to examine and edit operational
concepts and scenarios. These facilities may be used, for instance, in an experiment to
optimise the structure of a given ATM task.

The PUMA Methodology/Process

Video/Observational Task Analysis


Video recordings of the controller (over the shoulder), the radar/interface equipment and a
recording of the controller’s face are taken and combined into a single picture during real
time simulations and operational analysis. In such trials the controller is required to provide
instantaneous self assessment (ISA, Kilner 1994) data pertaining to how much workload the
controller is experiencing. This assists in the identification of high workload areas of the trial.
Such high workload areas are then examined after the trial during a cognitive debrief. A
period of the trial (typically of 20 minutes duration) will then be translated by hand into
timeline format, detailing all observable tasks and actions for a given controller role. This is
known as the observational task analysis (OTA).

Cognitive Debrief
During this part of the methodology the PUMA analyst and ATCO work through a structured
cognitive debrief, using verbal protocol analysis. The aim is to capture all of the covert
(cognitive) activities of the controller (including judgements, decision making, and
planning). The covert actions are then entered into the OTA file, which contains all overt
and covert actions associated with the period of controller work of interest.

Task Modelling
Controller actions are modelled in terms of their start and end times, and their times of
occurrence. Tasks can then be defined in terms of which actions constitute which tasks.
Typical tasks include coordination, accepting aircraft and resolving conflicts between
aircraft. Tasks are then “generified”, or averaged, to produce a standard version of each task in
terms of its typical duration and composition; generification allows the differences between
different controller styles to be accounted for in the final task model.
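As an illustration of what generification might look like in code (the task names, timings and data layout below are assumptions for the example, not the PUMA implementation):

from collections import defaultdict
from statistics import mean

# Each observed instance from the timeline: (task name, start s, end s, constituent actions)
observed = [
    ("accept_aircraft", 10.0, 22.0, ["scan_strip", "rt_call", "mark_strip"]),
    ("accept_aircraft", 95.0, 104.0, ["scan_strip", "rt_call"]),
    ("coordination",   130.0, 151.0, ["phone_call", "annotate_strip"]),
]

def generify(instances):
    """Average each task's duration and pool its observed constituent actions."""
    grouped = defaultdict(list)
    for name, start, end, actions in instances:
        grouped[name].append((end - start, actions))
    generic = {}
    for name, obs in grouped.items():
        generic[name] = {
            "typical_duration_s": mean(duration for duration, _ in obs),
            "actions": sorted({a for _, actions in obs for a in actions}),
        }
    return generic

print(generify(observed)["accept_aircraft"])   # typical duration 10.5 s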

Workload Attribution
Once a model of generic tasks exists, the Workload Analysis Tool may be run. This particular
tool uses a British Aerospace (BAe) implementation of Wickens’ “M” model. The M model
contributes to the calculation of overall workload on the basis of a number of limited capacity
information processing resources, or channels, that allow the controller to undertake actions.
The use of a conflict matrix within PUMA adjusts the workload calculations in terms of the
penalties (extra workload) incurred by two or more processing channels interfering with one
another as they compete for cognitive processing time. The Workload Analysis Tool provides
a representation of calculated workload over time for the generic tasks examined by the toolset.
In this way, the workload intensive tasks can be identified directly using the workload graph.
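A highly simplified sketch of this style of calculation (the channel names, demand values and conflict penalties below are invented; they do not reproduce the BAe implementation of the Wickens model used in PUMA):

# Sum per-channel demands for a time slice, then add a penalty whenever two
# loaded channels appear in the conflict matrix. All numbers are illustrative.
CHANNELS = ("visual", "auditory", "cognitive", "psychomotor")

CONFLICT = {                     # invented conflict matrix entries
    ("visual", "cognitive"): 0.4,
    ("visual", "psychomotor"): 0.3,
    ("auditory", "cognitive"): 0.3,
}

def workload(demands):
    """demands: dict of channel -> demand in [0, 1] for the current time slice."""
    total = sum(demands.get(ch, 0.0) for ch in CHANNELS)
    for (a, b), penalty in CONFLICT.items():
        if demands.get(a, 0.0) > 0.0 and demands.get(b, 0.0) > 0.0:
            total += penalty * min(demands[a], demands[b])   # interference cost
    return total

# e.g. monitoring the radar while mentally resolving a conflict (invented demands)
print(round(workload({"visual": 0.6, "cognitive": 0.7}), 2))   # 0.6 + 0.7 + 0.4 * 0.6 = 1.54

Running such a function over successive time slices of the generified task timeline would give the kind of workload-over-time graph referred to above.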

Task & Action Verification


Once tasks have been generified, their structure is then reviewed to ensure that they are
operationally realistic. This process entails the examination and editing of generic tasks by a
subject matter expert. Here, a suitably experienced controller will examine each generified
task in terms of its constituent actions, their duration, sequencing and associated workload.
Once the generic tasks and their associated workload profiles are approved, further analysis
and optimisation can begin.

Workload Analysis
The workload associated with any particular generified task can be calculated so as to
determine its contribution to the overall workload resulting from a sequence of tasks. This
means that the high workload tasks, or task-combinations, can be studied and optimised in
terms of their sequencing. The effects of implementing a new technology concept or HMI
may also be explored. An additional facility for further task workload analysis is known as
the static analysis. In such analysis, data are provided on the number of occurrences of each
task, and on the percentage of time which the controller would spend performing that task or
action. This information can aid in the identification of (predominant) tasks which might
benefit most from optimisation.
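A sketch of such a static analysis over a task timeline (the timeline, task names and analysis period below are invented for illustration):

# Count occurrences of each task and the percentage of the analysed period
# spent on it; all figures are illustrative.
timeline = [                       # (task name, start s, end s) over a 1200 s period
    ("accept_aircraft", 10, 22), ("coordination", 40, 61),
    ("accept_aircraft", 95, 104), ("resolve_conflict", 300, 345),
]
PERIOD_S = 1200.0

stats = {}
for name, start, end in timeline:
    entry = stats.setdefault(name, {"count": 0, "time_s": 0.0})
    entry["count"] += 1
    entry["time_s"] += end - start

for entry in stats.values():
    entry["percent_time"] = 100.0 * entry["time_s"] / PERIOD_S

print(stats["accept_aircraft"])    # {'count': 2, 'time_s': 21.0, 'percent_time': 1.75}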

Previous Applications
The following section describes two studies in which NATS has applied PUMA in different
settings. In the first project, Programme for Harmonised Air Traffic Management Research in
Eurocontrol (PHARE), PUMA was used to analyse a real time simulation, make
recommendations for change and then analyse the subsequent real time simulation with and
without the recommendations implemented. The second project briefly describes how PUMA
was used in conjunction with fast time computer based simulations to provide a more detailed
analysis of workload.

PHARE
PUMA has been used at various development stages of the Ground Human Machine Interface
(GHMI) for the Programme for Harmonised Air Traffic Management Research in
Eurocontrol (PHARE), a European future ATM concept that has both ground and airborne
elements. PUMA analysis was undertaken on video data collected for both tactical and
planner en-route controller positions during the PHARE Demonstration 1 trials held in 1995.

PUMA analysis identified areas of high controller workload when the PHARE GHMI was
used under different operating conditions (or operational concepts). PUMA analysts were
then, with the help of subject matter experts (SMEs), able to identify causes of high
workload, and develop recommendations for improving tool interface design, information
display, controller training and task structuring (Rainback et al 1997).

Using the PUMA data collected from the PHARE Demonstration 1 trial, it was possible to
model the recommended changes to the operational concept. The workload curves produced
by the PUMA model then allowed the proposed changes to be refined before they were
implemented in a later real-time simulation. The results from the revised simulation
operational concept saw a marked reduction in controller workload. Under the original
operational concept the controllers experienced difficulties maintaining control of the sector
and operating in a composed manner. However, under the revised operational concept the
controllers had a planned and structured means of operating, and were able to maintain
control of their sectors.

The PUMA findings were substantiated by comments made by controllers in the debrief who,
when presented with the same traffic sample as used in the original operational concept but under
the revised operational concept, said, “It’s a different traffic sample, that’s why it is so much
easier to control”. Another controller said, “you must have made huge changes to the user
interface, because it’s so much easier to use.” In fact, the various changes recommended
from the PUMA findings each required very slight alterations to be made to the interface (for
example colour coding and highlighting).

TOSCA—Fast time model work


Work undertaken within the TOSCA (Testing Operational Scenarios for Concepts in ATM
(an investigation of the concept of free flight in European Airspace)) project used PUMA to
predict controller workload for scenarios generated by fast-time simulations of aircraft
routing strategies. PUMA was used to add a finer level of detail to the workload calculations
associated with fast time models. In this way the hypothetical workload calculations provided
by the fast time model could be augmented with the PUMA data derived from actual
controller-system interactions.

The approach taken was to develop a PUMA model of an operational concept of air traffic
control which was based on PHARE concepts, and use data obtained from fast-time computer
models of air traffic management. The observational task analysis model from PHARE was
augmented with information on conflict resolution strategies obtained by questionnaire-based
interviews with controllers. The PUMA model of workload was then generated by taking a
sequence of events generated by the fast-time simulation and mapping these to events from
the PHARE data. The resultant PUMA model was then used to predict the sequence of tasks
that might be performed by tactical and planner controllers when presented with the
developing traffic situation generated by a fast-time simulation (Kilner et al 1997).
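The mapping step might be sketched as follows (the event types, generic tasks and timings are invented and stand in for the PHARE-derived data rather than reproducing it):

# Map fast-time simulation events onto generic tasks to build a predicted task
# sequence for one controller position; all names and durations are illustrative.
EVENT_TO_TASK = {
    "sector_entry":       ("accept_aircraft", 11.0),
    "predicted_conflict": ("resolve_conflict", 45.0),
    "sector_exit":        ("transfer_aircraft", 8.0),
}

def predict_task_sequence(events):
    """events: list of (time s, event type) produced by the fast-time model."""
    sequence = []
    for time_s, event_type in sorted(events):
        task, duration_s = EVENT_TO_TASK.get(event_type, ("monitor", 2.0))
        sequence.append({"start_s": time_s, "task": task, "duration_s": duration_s})
    return sequence

events = [(120.0, "sector_entry"), (310.0, "predicted_conflict"), (600.0, "sector_exit")]
print(predict_task_sequence(events)[1]["task"])   # resolve_conflict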

Summary
The workload metric within PUMA is based on Wickens’ Multiple Resource Theory (MRT). MRT provides a relatively
sophisticated method of assessing workload in ATM. ATM is a relatively high workload
environment in which the controller must task share. Using MRT enables the analyst to
ensure the effects of competing resources within these tasks are assessed and measured. It is
intended that in the near future several other workload metrics will also be imported to
PUMA: total time on task, Time Line Analysis and Prediction (TLAP) and Visual Auditory
Cognitive Psychomotor (VACP). All these metrics will be available within PUMA to ensure
that the operational project under consideration has the most appropriate metric available.

The process of validation of the PUMA method and toolset will also begin in the near future.
Initially PUMA, as a task analytic measure of workload, will be assessed against a measure
of primary task performance and, where appropriate, against subjective and objective
measures of workload. These measures are recorded as a matter of course during real time
simulations held at the ATMDC.

PUMA’s greatest strength lies in its ability to not only measure workload but also to predict
the workload associated with, as yet, undeveloped concepts of operation. This process
(described above) allows those concepts of operation that are unlikely to yield benefits to
NATS to be filtered from the development process before the move to more resource
intensive real time simulation. The prediction of workload and the ability to measure
workload supplemented with the ability to accept data directly from fast time computer
models means that PUMA is a highly flexible tool that can be incorporated at various stages
in any design cycle. PUMA can be used to test design concepts at an early stage in the life-
cycle of a project and also undertake a workload comparison between several preferred
options for a particular interface element.

This paper has described the basis of a PUMA analysis of workload, and how PUMA is
applied within operational projects to refine proposed methods of operation, and how it adds
value to fast time computer simulations.

References
Hook, M.K., 1993, PUMA version 2.2 User Guide, Roke Manor Research Ltd Internal
Report X27/HB/1320/000.
Kilner, A.R. and Turley, N.J.T., 1994, Development and assessment of a personal weighting
procedure for the ISA tool. ATCEU Internal Note No 69.
Kilner, A., Hook, M., Marsh, D.M., 1996, TOSCA WP8: Workload Assessment. Report
reference TOSCA/NAT/WPR/08.
Moray, N., (ed) 1979, Mental Workload: Its Theory and Measurement, edited by N.Moray,
New York: Plenum Press.
Rainback, F., Hudson, D. and Lucas, A., 1997, PD1+ Final Report, Eurocontrol PHARE/CAA/
PD3–5.2.8.4/SSR; 1.
Wickens, C.D., 1992, Engineering Psychology and Human Performance, New York: Harper
Collins.
ASSESSING THE CAPACITY OF EUROPE’S AIRSPACE:
THE ISSUES, EXPERIENCE AND A METHOD USING A
CONTROLLER WORKLOAD.

Arnab Majumdar
Centre for Transport Studies
Department of Civil Engineering
Imperial College of Science, Engineering and Technology
South Kensington SW7 2BU

European airspace often operates at or beyond capacity, leading to substantial delays and
inefficiencies. The design, planning and management of European airspace is a highly complex
task, involving many issues. Prime amongst these issues is the level of the workload of the air
traffic controllers. This paper considers the issues involved in airspace capacity determination and
its relation to the controller’s workload capacity. The impact of the sector and air traffic features
on the controller’s workload is described, whilst the last section outlines ongoing research work at
Imperial College to estimate airspace capacity given the various factors involved.

1. Introduction.
The air traffic control (ATC) system plays an integral part in the safe and orderly movement of air
traffic and relies upon a balance between technology and air traffic controllers. Air traffic doubled
in Europe during the last decade, much in excess of the predictions upon which the developments
of the national ATC systems were based. A recent study (ATAG, 1992) forecasts that the total
number of flights in Western Europe will increase by 54% from 1990 to 2000, and by 110% from
1990 to 2010, leading to more than 11 million flights a year in Western Europe. This air traffic is
unevenly spread throughout the continent, with a core area where traffic density is the highest.
Modelling studies show that this area will expand and that there will soon be a situation of almost
impossible air traffic density in which the skies over Europe are at their busiest. This will lead to an
ever-increasing workload being placed upon the ATC network in Europe, and it is generally held
that workload within the ATC system has risen dramatically in the past few years, with further
growth predicted in the future. There has also been an increase in the complexity of traffic. Responding to
this, the aviation authorities in Europe introduced the European Air Traffic Control Harmonisation
and Integration Programme (EATCHIP), administered by EUROCONTROL (European
Organisation for the Safety of Air Navigation) to integrate and harmonise the present, disparate
European ATC systems by 2000, and then to design a future air traffic control system. This relies
heavily upon high levels of technology and automation to aid controllers (Majumdar, 1994).
Currently, in most Western European en-route airspace sectors, controller workload is the primary
limitation of the ATC system—a concept which is difficult to understand, comprising both tasks
required directly for the control of individual aircraft and associated tasks. This situation is likely
to remain in the short to medium term and thus controller workload will remain the dominant
factor in determining sector and system capacity, and any modification of the airspace structure
which reduces controller workload should increase airspace capacity.

2. Airspace Capacity
Unlike land transport, the meaning of the term capacity for airspace is non-trivial. In road design
for example, it is possible to estimate capacity in terms of the number of vehicles when the flow on
the road is saturated. Similarly, for a railway line, it is possible to estimate the maximum number
of trains permissible, given the safety requirements and level of signalling technology, on any
particular railway.
For certain sectors, EUROCONTROL have available the declared capacity for that sector, in
terms of the number of flights per hour. However, these figures need to be noted with caution.
First of all these figures are the capacities that controllers declare available for their particular
sector for the peak hour at a given time period. Such a figure gives no indication of the underlying
traffic mix, such as the proportion of aircraft ascending/descending, nor of the type
of aircraft, e.g. large or small. Furthermore, the values of these capacities change after a period of
time depending upon the season, additional technology, etc. A better indicator of the capacity of a
sector than simply the aircraft per hour is capacity miles measured in nautical miles per hour.
This is defined, for a sector, as the declared capacity multiplied by the average route length within
that sector. This is an indicator of the real capacity of the sector in that it measures the capability of
a sector to resolve “real transportation” problems, including both the traffic flow and flight
distance parameters. It is, in effect, the maximum flight miles which can be controlled in a given
sector over a long period. But again, given that one of the terms is the declared capacity, the
concerns noted above remain.
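For illustration only (the figures are hypothetical, not EUROCONTROL data): a sector declaring 40 flights per hour with an average route length of 45 nautical miles gives

$\text{capacity miles} = 40\ \text{flights/h} \times 45\ \text{nm} = 1800\ \text{nm/h}$,

so a long-route sector can score more capacity miles than a busier sector with shorter average routes.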
In considering the capacity of the airspace system, there is a need to consider three factors
(Stamp 1990); the physical pattern of routes and airports; the pattern of traffic demand, both
geographic and temporal; and any ATC routing procedures designed to maximise the traffic
throughput. The prime concern of the ATC authorities is the total number of flights that can safely
be handled, rather than the number of passengers, or other measures. System capacity expressed by
these measures is linked in a complex fashion by average aircraft size; load factors and demand for
non-passenger flights. Forecasts of these depend in turn upon assumptions about market forces,
about diurnal, weekly and seasonal demand patterns, and any anticipated responses of airlines to
these and to any predicted capacity constraint. A second factor in this is the time span for the
capacity estimate, e.g. peak hour or day, a “typical” busy hour or day, or a whole year. Linkages
between such measures are complex and depend upon the diurnal, weekly and seasonal demand
profiles. The tendency has been for such diurnal and seasonal profiles to become flatter. And
finally, there is a need to specify the geographical area considered, e.g. from an individual ATC
sector all the way to the European air traffic network. In general, the wider the area being
considered, the more complex is the task of estimating capacity since such an estimate must be
built up in stages starting with the capacities of the individual airports and airspace sectors and
progressing through to the individual ATCCs.
In a system agreed to be operating at its capacity limit, there is likely to be some room for
extra flights on particular routes or at particular times. This unused “spare capacity” exists
because it is presumably not sufficiently attractive in economic terms for airlines to operate
additional services. Therefore, the total system capacity will not be the simple sum of the
capacities of all the constituent parts, but can only be estimated once the pattern of demand is
specified, with different patterns giving different capacities. A system as complex as the UK or
European ATC network is not static, but instead changes dynamically so as to react to expected
capacity constraints. Consequently, the estimate of total capacity is likely to be more robust if
several parts of the system “saturate” at about the same time. In such a situation, the opportunity
for finding relatively simple ways of alleviating the constraints is much reduced. The
geographical division of airspace over continental areas of Europe is based upon national
boundaries, with straight lines as limits over maritime areas and analysis of this operational
division shows that the limits of a sector have usually been defined in relation to the acceptable
workload within its airspace (EUROCONTROL, 1990). When the traffic levels increase,
subdivision of the sectors has been the normal method of ensuring acceptable workload limits.
However, the biggest regions of Europe have now reached the limit at which further subdivision
results in an increase of the coordination workload outweighing the decrease generated by the
reduction of traffic handled. In addition, the room for manoeuvre in a sector is reduced with
subdivision. In this connection, it should be noted that the airspace as such is not saturated, but
that the flow of data to be processed by the controller is becoming excessively heavy. Therefore,
it is the controller who is saturated in terms of the tasks that he must do. Therefore, in most
Western European en-route airspace sectors—with current level of technology—the air traffic
control sector capacity is determined primarily by the workload of the controller, both for
directly observable tasks as well as mental tasks that also need to be done if the traffic is to be
safely handled. In terms of simple traffic flow into a sector, i.e. the number of aircraft,
irrespective of their size or attitude, the traffic load of the sector defines the average hourly traffic
demand between 06:00 and 18:00. It is an indicator for controller workload (used for estimating
capacity exploitation), and the hourly traffic load is proportional to the routine workload. Often,
instead of the hourly flow, the number of aircraft—ignoring attitude and size—within a sector at
any instant is an equally good indicator for a sector. This represents the number of aircraft a
controller must control at any time, and obviously too large a number will lead to control
difficulties. This instantaneous load defines the average instantaneous number of aircraft within
a sector and is proportional to the monitoring workload of the controller.
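A minimal sketch of computing these two indicators from sector entry and exit times (the flight times and the handling of the 06:00 to 18:00 window are assumptions for illustration):

# Hourly traffic load: average number of entries per hour over 06:00-18:00.
# Instantaneous load: time-averaged number of aircraft present in the sector.
flights = [          # (entry hour, exit hour) in decimal hours, invented sample
    (6.2, 6.5), (6.4, 6.8), (7.1, 7.3), (9.0, 9.4), (9.1, 9.6), (17.5, 17.9),
]
WINDOW_START, WINDOW_END = 6.0, 18.0

def hourly_traffic_load(flights):
    entries = sum(1 for entry, _ in flights if WINDOW_START <= entry < WINDOW_END)
    return entries / (WINDOW_END - WINDOW_START)            # flights entering per hour

def instantaneous_load(flights):
    occupancy_h = sum(max(0.0, min(exit_, WINDOW_END) - max(entry, WINDOW_START))
                      for entry, exit_ in flights)
    return occupancy_h / (WINDOW_END - WINDOW_START)         # average aircraft present

print(hourly_traffic_load(flights))            # 0.5 flights per hour in this toy sample
print(round(instantaneous_load(flights), 3))   # average aircraft simultaneously in the sector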

3. Sector Complexity
The concept of sector complexity is one which poses considerable problems. A FAA review of
1995 (Mogford et al. 1995) based on the literature available defined ATC complexity as a
construct—a process which is not directly observable, but gives rise to measurable phenomena—
that is composed of a number of sector and traffic complexity dimensions or factors. These factors
can be either physical aspects of the sector, e.g. size or airway configuration, or those relating to
the movement of traffic through the airspace, e.g. the number of climbing flights. Some factors
cover both sector and traffic issues, e.g. required procedures and functions. The FAA ATC
complexity term refers to the effect on the controller of the complexity of the airspace and the air
traffic flying within it. In theory, the structure of a sector is separate from the characteristics of the
air traffic. However, when considering ATC complexity, it is not useful to separate these concepts
and consider them in isolation. A certain constellation of sector features might be easy to handle
with low traffic volume or certain types of flight plans. More or different traffic might completely
change this picture. When there is no traffic in the sector, there is no complexity (i.e. there is no
effect on the controller). On the other hand, a given level of traffic density and aircraft
characteristics may create more or less complexity depending on the structure of the sector. Traffic
density alone does not define ATC complexity, but it is one of the variables that influences
complexity and so is a component of complexity. Its contribution to ATC complexity partially
depends on the features of the sector.
ATC complexity generates controller workload and it is the thesis of the FAA review that
controller workload is a construct influenced by four factors. The primary element consists of a
constellation of ATC complexity factors. Secondary components (acting as mediating factors)
include: the cognitive strategies the controller uses to process air traffic information; the quality of
the equipment (including the computer-human interface); individual differences (such as age and
amount of experience). Controller workload originates from the sector and the aircraft within it.
The procedures required in the sector, flight plans of the aircraft, traffic load, weather and other
variables form the basis for the tasks the controller must complete. The amount of workload
experienced by the controller also may be modulated by the information processing strategies
adopted to accomplish required tasks. Such techniques have been learnt in training or evolved on-
the-job and may vary in effectiveness. The influence of a complex ATC environment on workload
can be ameliorated through the use of strategies that will maintain safety through, for example,
simpler or more precise actions. The effect of equipment on workload is also relevant to ATC. The
controller's job may be made easier if a good user interface and useful automation tools are
available. This will ensure that adequate and accurate information is presented to the controller to
allow for effective task completion. Personal variables, e.g. age, proneness to anxiety and amount
of experience can also influence workload. Variations in skill between controllers can be
pronounced. These factors can have a strong effect on the workload experienced by a given
controller in response to a specific array of ATC complexity factors.

4. The use of the RAMS controller workload model in airspace capacity research


At Imperial College we have begun a study of the factors involved in ATC capacity for Europe, and
their quantitative loading, as a function of the controller’s workload. To estimate the airspace
capacity of Europe’s system, the following steps are required:

• defining the physical characteristics of the system—simple but time-consuming


• once the airspace sectorization has been designed, use the RAMS methodology to estimate controller workload.

To study the effectiveness of control measures on the flows of traffic being simulated, the RAMS
simulator (EUROCONTROL, 1995) developed by EUROCONTROL is used to measure
workloads associated with existing or proposed ATC systems and organisations. Due to its conflict
detection/resolution mechanisms, and its flexible user interfaces and data preparation environment,
RAMS allows the user to carry out planning, organisational, high-level or in-depth studies of a
wide range of ATC concepts. RAMS has access to the EUROCONTROL database maintained for
description and definition of the European aviation environment, e.g. airspace, airports and traffic
loadings, to aid data preparation. It then simulates the flow of traffic through the defined area using
accurately modelled 4-dimensional flight profiles for any one of 300 currently supported aircraft
types. Realistic simulation is helped by the use of advanced conflict detection algorithms and rule
based resolution systems. During the simulation recordings are made to assess the associated
workloads placed upon the relevant controllers required to undertake the simulated activity. In the
RAMS model, each control area is associated with a sector, which is a 3-dimensional volume of
airspace as defined in the real situation. Each sector has two control elements associated with it,
Planning Control and Tactical Control, which maintain information regarding the flights wishing
to penetrate them, and have associated separation minima and conflict resolution rules that need to
be applied for each control element. To obtain the tasks for the controllers, RAMS uses the
ATC Task Specification used for the EAM (European Airspace Model). This lists a total of 109
tasks undertaken by controllers, together with their timings and position, for a number of
reference sectors in Europe. These tasks are grouped into five major areas: internal and external
co-ordination tasks; flight
data management tasks; radio/telephone communications tasks; conflict planning and resolution
tasks and radar tasks. The reference sectors chosen cover the core European area upper airspace,
and include sectors in the LATCC region, Benelux countries, France and Germany. The capacity of
the sector is estimated by identifying the individual tasks that the controller must undertake, the
time needed to achieve each task, and their frequency for a particular pattern of traffic and
routings. The total workload can then be estimated by summing these times. The capacity of the
sector is the traffic flow such that the estimated workload does not exceed a set criterion
(EUROCONTROL, 1996). However, the question of the set criterion for controller workload at
capacity is also complex. In the UK, where the DORATASK model of controller workload is used,
the following definition of controller capacity is obtained, agreed with operational ATC managers
(Stamp, 1992): a controller cannot be fully occupied, on average, for more than 48 minutes of an hour,
and a controller can only work in excess of 57 minutes in any one hour in forty. This guards against
traffic flows that produce workloads in excess of 57 minutes in any one hour. In using the
European Airspace Model, which is the precursor of the RAMS model of controller workload,
there are two values generally used by EUROCONTROL in the interpretation of controller
loadings; the PEAK HOUR PERCENTAGE LOADING and the AVERAGE PERCENTAGE
LOADING. To assist in the interpretation of these loadings, approximate criteria are used to
describe each loading (figures represent a percentage of 60 minutes):
Severe: PEAK HOUR loading >70%; AVERAGE (3 hour) loading >50%
Heavy: PEAK HOUR loading >55%; AVERAGE (3 hour) loading >40%
Moderate: PEAK HOUR loading <55%; AVERAGE (3 hour) loading <40%
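As an illustration of the task-time approach described above, the following minimal Python sketch estimates sector loading by summing task times and finds the largest flow that stays within a loading criterion. The task names, timings and the 70% threshold are assumed purely for the example; this is not the RAMS or DORATASK task set.

# Illustrative sketch of task-based sector loading (hypothetical figures,
# not the RAMS/DORATASK task set or timings).
TASK_TIMES = {
    "coordination":        20.0,   # internal/external co-ordination tasks (seconds)
    "rt_communications":   45.0,   # radio/telephone exchanges
    "conflict_resolution": 30.0,   # conflict planning and resolution
    "radar_monitoring":    25.0,   # radar tasks
}

def hourly_loading(aircraft_per_hour, task_times=TASK_TIMES):
    """Controller loading as a percentage of the 60-minute hour, assuming
    each aircraft generates one instance of each task type."""
    workload_seconds = aircraft_per_hour * sum(task_times.values())
    return 100.0 * workload_seconds / 3600.0

def capacity(threshold_percent=70.0):
    """Largest hourly flow whose estimated loading stays within the criterion
    (70% is the 'severe peak hour' figure above, used only as an example)."""
    flow = 0
    while hourly_loading(flow + 1) <= threshold_percent:
        flow += 1
    return flow

print(hourly_loading(20))   # loading for 20 aircraft per hour
print(capacity())           # flow at which the example criterion is reached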
The main research concerns determining the impact of various sector, route and traffic factors
on the controller’s workload, and then also determining the “equivalence factors” for the effect of
different types of aircraft movement on the controller’s workload. For the purposes of our
research, the modulating factors on the amount of workload experienced by the controller are
ignored. We take as the starting point the concluding remarks for the study on sector complexity by
the FAA (Mogford et al. 1995), that in any future study of ATC complexity “it would be more
beneficial to focus further investigation on ATC complexity on refining our understanding of the
complexity factors so that intelligent sector design and traffic management studies become
feasible. It should be possible to discover how much weighting each salient complexity factor
has in determining overall complexity and controller workload. In this way, ATC environments
could be created that have predictable effects on the controller" (p20). Table 1 outlines some of the
sector, route and traffic characteristics whose impact on controller workload, and ultimately
capacity, is to be determined using RAMS. The technique required in this case is response surface
methodology, used to estimate the nature of the workload surface.
TABLE 1. Some Airspace characteristics in Europe
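As a hedged illustration of the response surface approach just mentioned, the sketch below fits a second-order surface by least squares to workload values obtained at different factor settings. The two factors and all the numbers are invented for the example; in the research itself the factors would come from Table 1 and the workload values from RAMS runs.

import numpy as np

# Hypothetical RAMS outputs: workload (% of hour) observed at combinations of
# two illustrative factors, e.g. x1 = hourly traffic flow, x2 = proportion of
# climbing/descending traffic.
x1 = np.array([10, 10, 20, 20, 30, 30, 40, 40], dtype=float)
x2 = np.array([0.1, 0.4, 0.1, 0.4, 0.1, 0.4, 0.1, 0.4])
w  = np.array([22, 28, 41, 52, 60, 74, 78, 95], dtype=float)

# Design matrix for a full second-order (quadratic) response surface.
X = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
coeffs, *_ = np.linalg.lstsq(X, w, rcond=None)

def predicted_workload(flow, prop_climbing, b=coeffs):
    """Evaluate the fitted surface at a new factor combination."""
    return (b[0] + b[1] * flow + b[2] * prop_climbing
            + b[3] * flow**2 + b[4] * prop_climbing**2
            + b[5] * flow * prop_climbing)

print(predicted_workload(25, 0.25))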

Another aspect of the research is to develop "equivalence factors" for aircraft movements, similar
to the concept of passenger car units (pcus) widely used in urban road traffic design, and in particular
at signalised junctions. The saturation flow of a traffic stream at a signal depends upon, amongst
other items, the number of heavy vehicles in the stream: as their number increases, so the saturation
flow decreases. This effect is represented by ascribing passenger car units (pcus) to vehicles of
various classes (heavies, two-wheelers, etc.) so that saturation flow can be written as a constant
number of pcus per unit time. Similarly, it seems intuitive that different types of flow will have
different impacts on the capacity of a sector, e.g. it should be relatively easy to control a sector where
there are, say, twenty flights all in level cruise, as opposed to a sector where there are 10 flights
ascending and 10 descending. The additional workload elements involved in the latter case will include
greater conflict detection effort, mental separation calculations, etc. Indeed, from earlier work
undertaken by the CAA (Stamp, 1992) with DORATASK, it is known that different types of flight
movements have differential impacts on controller workload. As has been stated earlier, in terms of
air traffic capacity in Europe, it is controller saturation which determines capacity. These equivalence
factors are simply weighting factors which allow account to be taken of the influence of different types
of aircraft attitude on the flow through a sector when the controller is working at "capacity", i.e. the
flow at "controller saturation".
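By analogy with pcus, the sketch below shows how such equivalence factors might be applied once estimated; the factor values used here are hypothetical placeholders, since estimating them is precisely the aim of the research.

# Hypothetical equivalence factors expressing each movement type in
# "level-cruise-equivalent" aircraft (values invented for illustration).
EQUIVALENCE = {
    "cruising":   1.0,
    "climbing":   1.6,
    "descending": 1.5,
}

def equivalent_flow(movements):
    """Weighted flow through the sector, in cruise-equivalent aircraft per hour.

    movements: dict mapping movement type to hourly count, e.g.
               {"cruising": 20} or {"climbing": 10, "descending": 10}.
    """
    return sum(count * EQUIVALENCE[kind] for kind, count in movements.items())

# Twenty level-cruise flights versus ten climbing plus ten descending:
print(equivalent_flow({"cruising": 20}))                    # 20.0
print(equivalent_flow({"climbing": 10, "descending": 10}))  # 31.0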

References
ATAG (1992) European traffic forecasts, ATAG: Geneva, Switzerland.
EUROCONTROL EXPERIMENTAL CENTRE (1995) RAMS system overview document, Model
Based Simulations Sub-Division, EUROCONTROL, Bretigny-sur-Orge, France
EUROCONTROL EXPERIMENTAL CENTRE (1996) RAMS User Manual Version 2.1,
EEC/RAMS/UM/OO13, Model Based Simulations Sub-Division, EUROCONTROL,
Bretigny-sur-Orge, France.
Majumdar, A. (1994) Air traffic control problems in Europe—their consequences and
proposed solutions, Journal of Air Transport Management, 1(3), 165–177.
Mogford, R.H., J.A.Guttman, S.L.Morrow and P.Kopardekar (1995) The complexity construct
in air traffic control: a review and synthesis of the literature, DOT/FAA/CT-TN92/22,
Department of Transportation/Federal Aviation Administration Technical Center,
Atlantic City, NJ.
Stamp, R.G. (1992) The DORATASK method of assessing ATC sector capacity—an overview,
DORA Communication 8934, Issue 2, Civil Aviation Authority, London.
EVALUATION OF VIRTUAL PROTOTYPES FOR
AIR TRAFFIC CONTROL—THE MACAW TECHNIQUE

Peter Goillau, Vicki Woodward,


Chris Kelly and Gill Banks

Air Traffic Control Systems Group (ATCSG),


DERA, St. Andrew’s Road, Malvern,
Worcestershire, WR14 3PS, UK

In a complex domain such as Air Traffic Control (ATC) it is not always


possible to prototype a system and its human-computer interface for
assessment before implementing the final system. There is a requirement for
a technique which enables a human factors input to be made at the earliest
conceptual stages of the system lifecycle. The MAlvern Cognitive Applied
Walkthrough (MACAW) approach builds on Software Engineering and
Cognitive Walkthroughs, but adds a number of practical and applied
dimensions geared towards the specific needs of the ATC domain. The
MACAW technique was employed during a project to assess different
options for automation in future Air Traffic Management (ATM) systems.
Paper specifications supplemented by user interface visualisations were used
as ‘virtual prototypes’ for preliminary assessment. Results from
representatives of the user population yielded valuable early insights into the
likely benefits, problems and usability of the ATM automation approaches.

Introduction
Conventional human factors wisdom (Hopkin, 1995) advocates prototyping a system and its
Human-Computer Interface (HCI) for expert assessment, before proceeding to implement the
final system. However, constructing a fully working prototype in a complex domain such as
Air Traffic Control (ATC) is a substantial activity, requiring significant resources of time and
manpower. The use of appropriate software tools can ease the prototyping process, but again
the learning curve and the necessary programming support associated with such tools are not
trivial. In the real world, there is a recurring problem of the first human factors assessments
occurring too late, often well into the system development lifecycle. There is therefore a
concomitant requirement for a structured but flexible technique, beyond the so-called ‘cheap
and dirty’ usability methods, which enables a human factors input to be made at the earliest
conceptual stages of the lifecycle when only a paper specification exists. The present paper
addresses this gap in available techniques by extending walkthroughs using paper-based
‘virtual prototypes’.

Background
The evaluation technique developed and employed by the DERA ATCSG is known as
MACAW (MAlvern Cognitive Applied Walkthrough). The MACAW approach builds on the
work on Cognitive Walkthroughs and Software Engineering design walkthroughs, but adds a
number of practical and applied dimensions geared towards the specific needs of the ATC
application domain.

Pluralistic Walkthroughs
MACAW takes as its starting point the work of Bias (1991) on multi-user, informal
Pluralistic Walkthroughs. Bias’ work at IBM used three types of ‘expert’: product developers,
human factors specialists and representatives of the expected user population. The multi-user
paradigm gave valuable perspectives from different viewpoints, and its informal approach
was found to be a useful way of articulating interface usability issues before building a
prototype or implementing software.

Cognitive Walkthroughs
The walkthrough technique is derived from the code walkthrough review method of Software
Engineering (Preece et al, 1994). As in the software version, the goal of HCI design
walkthroughs is to detect potential problems early on in the design process so that they may
be avoided. System designers ‘walk through’ a set of well-defined tasks, estimating necessary
actions to achieve the tasks and predicting end users’ behaviour and problems. An inherent
problem with this approach—particularly for novel systems—is that the evidence on which to
make reliable predictions of user behaviour may not be available (given that the end users
themselves are not represented in the walkthroughs).
Polson et al (1992) describe their Cognitive Walkthrough as a hand simulation of the
cognitive activities of a user. They also aim to identify potential usability problems by taking
a micro-level, strongly cognitive stance closely mirroring cognitive task analysis. The
walkthrough process is structured around specific questions which embody cognitive
psychological theory, thus the reviews are particularly appropriate for investigating how well
the proposed interface meets the cognitive needs of the intended users. However, a potential
problem once again is that no actual users are involved, so Polson’s approach is heavily
reliant on how familiar the system designers conducting the walkthrough are with the
cognitive set and the domain knowledge of their intended users. Although the strengths of
walkthroughs are acknowledged, the overly formal and detailed academic stance and lack of
user input are significant drawbacks.

The MACAW technique


MACAW aims to retain the firm cognitive theoretical foundations and multi-user
perspectives of these previous approaches, but also to add a level of practicality relevant to
the evaluation of future Air Traffic Management (ATM) automation options, derived from
experience of conducting human factors ATM trials at DERA Malvern (Kelly and Goillau,
1996). The underlying agenda was to assess the strengths, weaknesses and usability issues of
selected automation concepts, so that an informed decision could be made regarding their
future selection and implementation in ATM projects.

MACAW key features


• representative users from the scientific and Air Traffic Control Officer (ATCO) populations
• written high-level paper specifications of controller tasks, categorised into the cognitive
activities of Communication/Monitoring/Decision-Making/Planning/ Negotiation (based
on previous task analyses of ATC)
• coverage of the Departure/En-Route/Approach/Landing phases of an aircraft’s typical
flight across Europe
• screen shots of visualised exemplar ATM automation concept interfaces
• framework of semi-structured interviews using a questionnaire format
• video-taping of walkthroughs for later off-line analysis

MACAW questionnaire
A standard questionnaire (Figure 1) was used to facilitate the walkthrough of:

• task actions, associated goals and any task-goal mismatches


• perceived advantages and problems likely to be encountered with the automation
• implications for cognitive processing: memory, perception, understanding, learning
• estimates of ATCO performance, workload and capacity (Goillau and Kelly, 1997)
• estimates of errors and error types: commission and omission
• implications for other issues: timing, system failure, feedback needs, locus of control
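For later off-line analysis (for example of the video-taped walkthroughs), responses against this framework could be captured as structured records; the sketch below is a hypothetical illustration only, and its field names are not taken from the published MACAW questionnaire.

from dataclasses import dataclass, field
from typing import List

@dataclass
class WalkthroughRecord:
    """One questionnaire entry for a single task/automation-option pairing
    (field names are illustrative assumptions, not the original instrument)."""
    flight_phase: str                 # Departure / En-Route / Approach / Landing
    cognitive_activity: str           # Communication, Monitoring, Decision-Making, ...
    automation_option: str
    task_goal_mismatches: List[str] = field(default_factory=list)
    perceived_advantages: List[str] = field(default_factory=list)
    perceived_problems: List[str] = field(default_factory=list)
    likely_errors: List[str] = field(default_factory=list)   # commission / omission
    workload_estimate: str = ""       # e.g. "low", "moderate", "high"

record = WalkthroughRecord(
    flight_phase="En-Route",
    cognitive_activity="Monitoring",
    automation_option="Option B",
    perceived_problems=["trust in automation", "speed of automation responses"],
    likely_errors=["omission: missed hand-over prompt"],
)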

Application of the MACAW technique


As part of the European Commission funded ‘RHEA’ project (Role of the Human in the
Evolution of ATM Systems), DERA’s role was to assess selected options for automation in
future Air Traffic Management systems.
For the purposes of the RHEA project, the MACAW walkthrough technique involved
interviewing six subjects as they talked through the ATC scenario and automation options.
Two in-house scientific experts participated in the first sessions as a pilot study. This resulted
in a number of refinements to the experimental procedure and MACAW questionnaire. Four
very experienced, retired ATCOs were then used as subjects in the main experimental
sessions. Each scientific expert or ATCO was walked through individually, by a two-person
team of questioner and observer.
The ATC scenario and three ATM automation options had been defined on paper as high-
level task specifications. Time did not permit the construction of working prototypes. Also
available were screen snapshots of a typical European flight region obtained from the DERA
Real-time ATC Facility and Testbed (RAFT). These snapshots were imported into Microsoft
Powerpoint, and the latter’s drawing facility used to visualise potential exemplar interfaces
for each of the automation concepts. The paper specifications supplemented by the user
interface visualisations were thus used as ‘virtual prototypes’ and provided a starting point
for detailed MACAW discussions.
A semi-structured interview was employed for each walkthrough, using the MACAW
questionnaire as a framework. A written instructions sheet and explanation of the
questionnaire terms were also employed. Each ATC scenario and automation option took
approximately half a day to complete. At the end of the complete set of four walkthroughs,
each subject was encouraged to compare and contrast the three ATM automation concepts,
again using the MACAW questionnaire as a framework.

Figure 1. MACAW Questionnaire

Observations on MACAW use


There is clearly a limit to the quantity and quality of comments that can be gleaned from
paper specifications and static pictures of exemplar screen interfaces. However, in the
absence of working prototypes the MACAW approach was found to be very effective in
involving controllers and incorporating their wealth of operational experience into the early
assessment process. The interviews provided a valuable insight into the controllers’ opinions
about the ATM automation concepts, and were usefully combined with the opinions of human
factors experts and system designers. The individual interviews yielded a rich set of data,
including the likelihood and the types of errors which might occur. Videoing the interviews
proved to be a good backup mechanism, and having a questioner and observer team worked
well. The MACAW questionnaire was helpful as a standard way of structuring the interview
and looking at different aspects of automation usability, though it was not always possible to
separate out each cognitive component during the walkthroughs.
Considering the ATC scenario and automation options, some trends were evident in the
subjects’ preferences and comments. (These will be covered in a separate publication). A wide
range of views was received from the subjects, though to identify any general ATCO preferences
would require a much larger sample size. The high-level paper specifications were generally
felt to contain insufficient detail and were refined by the ATCOs in the course of the walkthroughs.
All subjects felt that trusting the automation was a major issue, as were the speed and accuracy
of the automation responses. Although all the automation concepts were construed as providing
potential benefits, each ATCO suggested different improvements to the concepts which might
be implemented in the final design. This stresses the importance of including end users as active
participants and stakeholders in the design and evaluation process.

Conclusions and recommendations


MACAW is a promising extension of the walkthrough approach. It was used successfully to
investigate the strengths, weaknesses and usability issues of potential ATM automation
concepts as ‘virtual prototypes’ prior to formal prototyping and system implementation. It
was found beneficial to combine multiple stakeholder viewpoints from ATCOs and human
factors experts. Some interesting insights into future automation system use and likely errors
emerged in the course of this study. There were found to be problems with insufficient
numbers of subjects and insufficient detail in the scenario specifications. These should be
borne in mind for future MACAW validation work.
At present, the MACAW approach can be commended as a cost-effective technique for
involving the users and initially assessing usability aspects of virtual prototypes in future Air
Traffic Management systems. It requires more extensive validation. The technique might be
suitable for other complex application domains such as command and control, process
control and aerospace.

References
Bias, R. 1991, Walkthroughs: efficient collaborative testing,
IEEE Software, September 1991, 94–95
Goillau, P.J. and Kelly, C.J. 1997, MAlvern Capacity Estimate (MACE)—a proposed
cognitive measure for complex systems. In Harris, D. (ed.) Engineering Psychology
and Cognitive Ergonomics, Volume 1: Transportation Systems, (Ashgate Publishing,
Aldershot), 219–225
Hopkin, V.D. 1995, Human factors in Air Traffic Control, (Taylor & Francis, London)
Kelly, C.J. and Goillau, P.J. 1996, Cognitive Aspects of ATC: Experience from the CAER
and PHARE simulations. Paper presented at Eighth European Conference on
Cognitive Ergonomics (ECCE’8), University of Granada, Spain, 10–13 September
Polson, P.G., Lewis, C., Rieman, J. and Wharton, C. 1992, Cognitive Walkthroughs: a
method for theory-based evaluation of user interfaces, International Journal of Man-
Machine Studies, 36, 741–773
Preece, J., Rogers, Y., Sharp, H., Benyon, D., Holland, S. and Carey, T. 1994, Human-
Computer Interaction, (Addison-Wesley, Wokingham)

Acknowledgements
The RHEA project was part-funded between 1996 and 1997 under the European
Commission’s RTD Transport Programme DG VII Directorate contract No AI-95-SC.107.
The RHEA partners were NLR (Netherlands), Sofréavia (France), Thomson-CSF Airsys
(France), NATS (UK) and DERA (UK). The views expressed in this paper are the authors and
are not necessarily those of DERA, the other RHEA partners or the European Commission.

© British Crown Copyright 1998/DERA

Published with the permission of the Controller of Her Britannic Majesty’s Stationery Office
DEVELOPMENT OF AN INTEGRATED DECISION MAKING
MODEL FOR AVIONICS APPLICATION

Doug Donnelly*, Jan Noyes** and David Johnson*

* Faculty of Engineering, University of Bristol


Bristol BS8 1TN, UK

** Department of Experimental Psychology, University of Bristol


8 Woodland Road, Bristol BS8 1TN, UK

Ever since the first commercial flight, the amount and complexity of
information available to flight deck crew have continued to increase.
Although modern avionics systems have provided many benefits for aircraft
operations, the advent of automation can lead to a decrease in crew
awareness, especially in abnormal situations. One solution is the
development of error tolerant systems which not only aid the crew in
detecting and diagnosing problems, but also provide feedback to the crew on
their actions. This paper will propose an integrated decision model, which
has been developed taking into account the known characteristics of the civil
flight deck decision environment and the human decision making
capabilities of crew, with a particular focus on situation awareness. The
model identifies points in the decision process where errors may be made
and suggests that these may be used as intervention points for decision
support, to prevent errors or to help recover from them.

Introduction
The flight deck is a unique environment for decision making: it is complex, dynamic, subject
to distractions, time pressure and at times an overload of information. Consequently, in order
to design systems for use on the flight deck, it is essential to understand how crew act under
certain conditions, how they respond to certain situations, and most importantly how they
make decisions (Abbott et al., 1996). There have been several attempts in the past to form
models of human decision making; the most successful of which tend to be based on
Naturalistic Decision Making (NDM) theories. NDM research considers decision making in
operational settings with experienced operators and so it is eminently suited to the civil flight
deck. However, due to the unique characteristics of the flight deck, there are certain aspects
of crew decision making which do not seem to be covered by existing NDM models and
theories. Furthermore, those theories which are aimed specifically at aviation decision
making do not seem to capture the complete picture. A model of crew decision making is
needed which can bridge the gap between understanding decision making and improving it.
This paper outlines an Integrated Decision Model (IDM) which draws on other theories but is
more descriptive of crew behaviour and applicable to supporting crew decision-making. This
model highlights areas of weakness in decision making and the types of errors that can be
made; it may also be used to point to areas where these errors may be detected, and where
decision support may intervene to correct them.

An Integrated Decision Model


There are certain characteristics of crew decisions which are essential to understanding flight
deck decision making. First and perhaps most important is ‘Situation Awareness’ (SA). Many
believe that a good SA is the key to effective decision making (Orasanu, 1995). Endsley
(1994) described SA as consisting of three levels: level 1 SA concerns the perception of
events, level 2 involves a comprehension of these events, and level 3 is the projection of
future developments. This is very similar to Klein’s Mental Representation (MR) which he
outlined in his Recognition-Primed Decision (RPD) model (Klein, 1993). This MR consists
of knowledge of what is happening (similar to level 1 SA), knowledge of the rules governing
the situation (level 2 SA), and knowledge of possible consequences, or expectancies for the
future (level 3 SA). The key difference in aviation decision making is that the crew begins
with a high SA which may degrade over time, unlike other experienced decision makers such
as fire fighters, who acquire SA as the situation clarifies. This is an important reversal since a
potential for error occurs when SA degrades (i.e. when the crew’s MR differs from the real
situation), as opposed to when a situation is not correctly assessed.

Figure 1. The proposed Integrated Decision Model



It shows that the crew’s MR and the difference between this and the actual situation, play
a key role in the decision process, since any decisions are based on this. In the case of the
flight deck, where procedures have been previously determined, experience is essential in
matching the information and cues to a familiar situation, in order to maintain the MR and to
know which procedure is relevant. This is where Klein’s RPD model is most appropriate.
The model shows that there are three paths which the crew may take in making a decision. If
there is not enough information, or the situation is complex, s/he may seek more information to
clarify his/her representation of the situation. If the crew is satisfied with the representation, s/he
may form intentions to act, may consider the consequences of these actions, and may even
perform a mental simulation such as that described by Klein (1993). However, under certain
circumstances, a short cut may be taken, which bypasses this process of forming intentions and
considering consequences. When a situation is routine or if there is time pressure, the person
may act or react automatically. This automaticity seems to be the key to the versatility of human
decision making and problem solving, but can also be its downfall.
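Purely as a schematic of the three paths just described (not an implementation of the IDM, and with inputs and thresholds invented for illustration), the branching logic can be summarised as follows.

def choose_decision_path(mr_confidence, time_pressure, routine_situation):
    """Schematic of the three IDM paths: seek information, deliberate,
    or take the automatic short cut. All inputs and thresholds are
    illustrative assumptions, not values from the model."""
    if routine_situation or time_pressure:
        # Short cut: act or react automatically, bypassing the consideration
        # of consequences (the path where errors are most likely).
        return "act_automatically"
    if mr_confidence < 0.5:
        # Mental representation unclear or situation complex:
        # gather more information before acting.
        return "seek_more_information"
    # Satisfied with the representation: form intentions, consider
    # consequences, possibly run a mental simulation, then act.
    return "deliberate_then_act"

print(choose_decision_path(mr_confidence=0.8, time_pressure=True, routine_situation=False))
print(choose_decision_path(mr_confidence=0.3, time_pressure=False, routine_situation=False))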
Finally, there will be effects and consequences of the crew’s actions, or failure to act.
Since aviation decision making is a continuous process, these effects will feedback to the
crew in the form of changing events and trends. This feedback is a vital part of decision
making. It is a kind of fail-safe, a way of detecting errors and correcting them. If the situation
changes unexpectedly, or if the feedback does not correspond to the crew’s MR, they will
have to adjust their MR, or take a fresh look at the situation. Many actions or errors on the
flight deck are recoverable, but if the crew is distracted, is operating in poor conditions (e.g.
at night or in bad weather), or has too high a workload, this vital feedback may be missed.
This is where errors turn into accidents.

Experimental Study
An important aspect of the IDM and one which often leads to error, is the short-cut path
which bypasses the consideration of consequences. The two main conditions under which
this route is taken are time pressure and automaticity (when actions or situations become
routine). An experimental study was undertaken to examine the actions of decision makers
under these conditions. It was hypothesised that decision makers do not consider the full
consequences of their actions under conditions of time pressure and automaticity, and
therefore make more errors under these conditions.
Participants were asked to perform a process control task, analogous to a flight deck
scenario with some of the key characteristics of NDM. The task was a computer simulation of
an industrial distillation plant. The user had to regulate the distillation of alcohol to a desired
level of purity, while producing as much distillate as possible. There was also a secondary,
event-handling task where they were asked to respond to a message. There were two possible
responses, thus requiring a simple decision to be made by the user. The frequency of
occurrence and type of event were controlled such that the response became increasingly
automatic. The time allowed to respond to the event was also controlled so that the
participants were placed under time pressure.
There were 18 male and 11 female participants: 14 in the control condition and 15 in the
experimental condition (high time pressure and automaticity). Their mean age was 24.9 years,
with a standard deviation of 2.09 years. An independent-subjects design was used for both
trials, with participants randomly allocated to conditions. Automaticity and time pressure were
used as independent variables, while the number of correct responses to an event and the response
times were the dependent variables. The participants’ approach to the process control task was
also important since this would directly affect their responses to the events. For example, the
process was not easy to control and required a high mental workload in order to collect the
required volume and purity of distillate. However, by taking a less active approach, a smaller
quantity of less pure alcohol is collected and the task becomes much simpler. The overall purity
and volume of the collected alcohol therefore gave an indication of the approach taken by the
participant. Finally, a questionnaire was completed by each participant to provide a subjective
measure of automaticity, time pressure, performance and motivation.

Results
The results showed that, although time pressure greatly affected the way in which the participants
made decisions, the increasing routineness, or automaticity, of the decisions had little effect.
This was shown in both the number and types of incorrect decisions made under time pressure,
and by secondary measures such as the response times and the questionnaire answers. However,
the questionnaire results suggested that there did exist a certain amount of automaticity. The
participants found themselves responding automatically to events, but correcting themselves in
time. This may relate to Rasmussen’s skill or rule level of information processing (Rasmussen,
1993), rather than the knowledge level. Rasmussen showed that different tasks required differing
levels of mental processing, according to the nature of the task. He also showed that humans
made errors in knowing which level of processing is appropriate. This may correspond to the
different paths in the IDM, with the skill and rule processing levels relating to the short-cut path
used under time pressure and automaticity, and the knowledge level being represented by the
consideration of consequences.
The objective results did show that under time pressure or distractions, the decision
maker’s actions could be automatic and this might imply that there are two kinds of
automaticity, simple and complex, again relating to Rasmussen’s different levels of skill/
knowledge. The event-handling used in the experiments would have dealt with ‘simple
automaticity’, i.e. skill/rule level decisions (yes/no decisions) which are simple enough not to
warrant any real consideration of consequences under any conditions, but which, under
conditions of time pressure or distraction can lead to mistakes or slips. ‘Complex
automaticity’ would involve decisions which require knowledge level processing in order to
be made correctly, but which would, if the same decision was made frequently enough,
encourage the decision maker to become complacent and not use this knowledge level.

Future Directions
It is important that the proposed model for decision making is validated, both in terms of its
suitability to the flight deck and in its conclusions about crew behaviour. This could be done
through further experimental studies to investigate the actions of decision makers in various
situations. The main aim of such validation would be to determine the key areas of weakness
in crew decision making. If the IDM proves to be a valid model of the decision process, it
would then be possible to use it in the improvement of decision making on the flight deck.
This improvement may come through a combination of training, procedures and the design of
systems to support the crew’s decisions.

It would seem, from the proposed IDM, that two of the main areas of weakness in the
decision process are the formation and maintenance of an accurate MR, and the
consideration of the consequences of actions or inaction. A major problem, however, is that
these two areas lie within the internal decision process (shown by the dashed line). This
makes it extremely difficult to know when an error has been made and what type of error it is,
thus making intervention from flight deck systems unfeasible. The only means of interaction
with the crew is through the actions they perform and the information presented to them. This
is why solutions to human error have traditionally relied upon training to improve the internal
decision processes, and the improvement of information displays.
Many of the systems on the flight deck, such as information displays, tend to be error-
preventative, as opposed to error-tolerant. However, using the IDM it may be possible to
design error-tolerant decision support systems. Traditional decision support systems have
generally been based on normative decision models, and are not designed for use in situations
where time is short and information is not freely available. Many decision support systems
proposed for use on the flight deck rely on artificial intelligence technology which is not yet
available, and are not based on an understanding of crew decision making.
The IDM highlights areas where decision support could intervene to aid the crew. An
effective intervention point for decision support would be to provide feedback on the effects
or consequences of crew actions. This could also help to clarify or even restore crew situation
awareness. Such a system would essentially be a warning system which gathers information
on the aircraft and its present environment, so that it can provide the crew with an accurate
picture of the situation. If there is reason to believe that the aircraft is in an unsafe condition,
or if the crew’s actions may place the aircraft in danger, the system could inform the crew
of this.
It is important that the design of flight deck systems is based on an understanding of the
way in which the crew act and make decisions. They must allow the crew the freedom to use
their own decision strategy, whilst providing support in potentially unsafe conditions. The
proposed Integrated Decision Model, if validated, will allow such systems to be more
effective and may lead to improved decision making on the flight deck.

References
Abbott, K., Slotte, S. and Stimson, D. 1996, The Interfaces Between Flightcrews and
Modern Flight Deck Systems. (Report of the Federal Aviation Administration Human
Factors Team). Department of Transportation, Washington, DC
Orasanu, J. 1995, Situation Awareness: Its role in flight crew decision making. In
Proceedings of the 8th International Conference on Aviation Psychology, Columbus
Ohio
Endsley, M. 1994, Situation Awareness in dynamic human decision making: theory. In R.
Gilson, D.Garland and J.Koonce (eds.), Situation awareness in complex systems,
(Embry-Riddle Aeronautical University Press, Daytona Beach, Florida) 27–58
Klein, G. 1993, A recognition-primed decision model of rapid decision making. In G. Klein,
J.Orasanu, R.Calderwood and C.Zsambok (eds.), Decision making in action: Models
and methods, (Ablex, New Jersey)
Rasmussen, J. 1993, Deciding and doing: decision making in natural contexts. In G. Klein,
J.Orasanu, R.Calderwood and C.Zsambok (eds.), Decision making in action: Models
and methods, (Ablex, New Jersey)
PSYCHOPHYSIOLOGICAL MEASURES OF FATIGUE AND
SOMNOLENCE IN SIMULATED AIR TRAFFIC CONTROL

Hugh David

Eurocontrol Experimental Centre


91222 Bretigny-sur-Orge, CEDEX, France

P.Cabon, S.Bourgeois-Bougrine, R.Mollard

Laboratoire de l'Anthropologie Appliquée


45 Rue des Saints-Pères
75270 Paris, France

Eight Air Traffic Controllers carried out exercises using a TRACON


II Air Traffic Control (ATC) Simulator. After a training and
familiarisation day, the controller carried out four simulation
exercises, two with low and two with high traffic load. His performance during
each exercise was recorded. A self-assessment questionnaire for
fatigue and a test of event-related potential (ERP) were applied and a
sample of saliva was taken for cortisol analysis before and after each
experimental session. The NASA-TLX was completed after each
exercise and a test of alpha-rhythm attenuation was carried out at the
start and end of each day. Cortisol analysis, ERP and fatigue/sleep
questionnaires are recommended for further use in Real Time (RT)
simulations.

Introduction
The Eurocontrol Experimental Centre has probably the longest and certainly the
widest experience of large-scale RT simulation of Air Traffic Control (ATC). One
continuing concern has been the ‘objective’ measurement of the effects of carrying
out ATC on the controller. Recent developments in ATC (Brookings et al, 1996,
using a TRACON/Pro simulator), and elsewhere (Fibiger et al, 1986—hormonal
responses to stress, Kramer et al, 1987—Event-Related Potentials) suggested that
sufficient progress had been made to justify a further study of the feasibility of
measuring physiological and psychological correlates of stress in ATC.

Experimental Equipment and Design


The TRACON II simulator is a PC-based RT simulator, displaying a simulated
radar picture, with tables of strips for actual and expected traffic. Control orders
were inserted via the keyboard, and a speech generator ‘spoke’ the controllers’
orders and pilots’ communications. A labelled map of the (London) TMA area
employed, a table of data for the eight airports involved, and a table interpreting the
keyboard orders were provided at all times. Exercises were nominally thirty
minutes long, but continued until all the traffic had left the area, requiring up to 20
minutes after the last aircraft entered.
Eight male Air Traffic Controllers carried out individual exercises using a
TRACON II ATC Simulator over two days each. During the first day, the
controller carried out an initial exercise during which the displays and controls
were demonstrated. The controller was familiarised with the EEG and other
procedures employed during testing. Two exercises with low and high workload
respectively were then carried out, during which the controller was prompted and
assisted as necessary. During the second day, the controller carried out four
simulation exercises, two with a low and two with a high level of traffic load. The
high level of traffic corresponded to 20 or 30 aircraft entering in 30 minutes,
depending on the controllers’ estimated capacity. The low level was half the high
level. Although the same exercises were used throughout, no exercises were seen
twice by the same controller.

Measurements
The controller’s performance during each exercise was recorded. A self-assessment
questionnaire for fatigue and a test of event related potential (ERP) were applied
and a sample of saliva was taken for cortisol analysis before and after each
experimental session. The NASA-TLX was completed after each exercise and a test
of alpha-rhythm attenuation was carried out at the start and end of each day. A test
of ERP was also carried out during simulation exercises, where the controller was
able to carry it out. (The ERP test required the controller to listen to 150 auditory
tones, counting the high frequency tones—about one third. The process takes about
five minutes, and appears to the controller as a potentially distracting secondary
task.) The controller filled in questionnaires concerning his sleep pattern before,
during and after the two days experimentation.

Results

Sleep loss
Controllers showed no significant changes in the time for which they slept, and felt
less sleepy and tired on waking.

NASA-TLX
Controllers rated the higher loaded exercises to be more difficult. The mental
demand component was rated highest for the high traffic session at the end of day
one. Controllers rated their performance higher after the high load exercises,
although objective performance, measured as the ratio of the TRACON score to the
maximum value decreased.

Alpha Attenuation Test


The alpha attenuation test (Stampi et al, 1995) compares the proportion of alpha-
rhythm observed in the EEG when the eyes are shut with that when they are open.
In principle, when a controller becomes sleepy, his alpha rhythm should decrease
when his eyes are closed and increase when they are open. The ratio of alpha power
eyes closed/open is the alpha attenuation coefficient (AAC). A high AAC implies
high alertness and vice-versa. In this experiment, no significant effect was
observed.
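As a rough sketch of how the AAC might be computed from EEG segments (assuming scipy is available; the sampling rate, alpha band limits and single-channel handling are simplifications for illustration):

import numpy as np
from scipy.signal import welch

def alpha_power(eeg_segment, fs=256.0, band=(8.0, 12.0)):
    """Integrated spectral power in the alpha band for one EEG segment."""
    freqs, psd = welch(eeg_segment, fs=fs, nperseg=int(2 * fs))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return np.trapz(psd[mask], freqs[mask])

def alpha_attenuation_coefficient(eyes_closed, eyes_open, fs=256.0):
    """AAC = alpha power with eyes closed / alpha power with eyes open.
    A high AAC implies high alertness, a low AAC implies sleepiness."""
    return alpha_power(eyes_closed, fs) / alpha_power(eyes_open, fs)

# Example with synthetic data standing in for recorded EEG:
rng = np.random.default_rng(0)
closed = rng.standard_normal(256 * 30)   # 30 s of 'eyes closed' signal
opened = rng.standard_normal(256 * 30)   # 30 s of 'eyes open' signal
print(alpha_attenuation_coefficient(closed, opened))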

Subjective ratings of sleepiness and fatigue


A clear circadian rhythm, with a marked post-lunch dip, was observable. Sleepiness
and fatigue were generally closely related, although they diverged on the afternoon
of the second day, where controllers felt less sleepy, but more fatigued.

EEG—Spectral Analyses
Theta rhythm (4–7Hz) was high during the high traffic training session of day 1,
consistent with the view that theta activity is related to learning processes. During
the measured exercises, there was a shift from low frequency (Delta and Theta
rhythm 1–7Hz) to higher frequencies (Alpha and Beta rhythm 8–30Hz) for higher
workload samples, consistent with increased alertness.

ERP—Event Related Potential.


The relative amplitude of the P300 potential decreased significantly after high load
exercises, suggesting a measured fatigue effect attributable to the higher workload.
(P300 amplitude during the exercise decreases considerably, but the technique is
not practically applicable in RT simulations.)

Cortisol
Salivary cortisol normally shows a strong circadian rhythm, and did so here.
Comparisons were therefore made in terms of the change in cortisol levels after an
exercise compared with the level measured before. During the training day there
was a significantly greater increase in cortisol for the high-load exercise than for
the low-load exercise. During the measured day, no such effect was observed. The
controllers, however, reported, via the NASA TLX, a higher subjective workload.
This discrepancy may be attributed to the controllers being better able to cope with the
higher workload on the measured day, even though they still perceived a workload difference.

Individual Cortisol Rates


There were significant differences between the four controllers having higher
cortisol levels (HC) and the four having lower cortisol levels (LC). During the
training day, the HC group performed significantly better, and tended to rate the
workload higher. During the measured day, however, the HC group performed
significantly better in low traffic, but showed a strong decrement in performance in
high traffic. The HC group showed a marked increase in cortisol after the high
workload training exercise, and after the afternoon high workload measured
exercise, during which their performance deteriorated. Consideration of subjective
and objective evidence of sleepiness and fatigue suggests that the HC group were
more ‘fatiguable’ than the LC group. They slept longer during and after the
experiment, and felt more tired when they woke.

Discussion
This feasibility study, although based on relatively few subjects, has demonstrated
that various objective measures can be relevant to ATC. These results differ in some
respects from those of Brookings et al (1996). Brookings’s subjects were younger
and less experienced than those in this study, and received six hours of training,
rather than the four used here. Brookings compared short sequences of 15 minutes
of low, medium and hard workload, defined by manipulating task complexity (pilot
skill, traffic mix) or traffic volume (6, 12, or 18 aircraft), without considering the
relative capacity of the controllers.
Most measures of fatigue showed high sensitivity to the effect of task demand.
ERP showed relatively lower amplitude immediately after high workload exercises.
Fatigue and sleepiness tend to dissociate under these conditions. The demands of
the task require a high level of alertness, which induces further fatigue. (Similar
effects are observed elsewhere—for example in the later stages of long-distance
flights—Cabon et al, 1996.) ERPs cannot practically be recorded during
simulations, both because they form a distracting secondary task, and because, in
RT simulations, the controller speaks, which disrupts the ERP signal. Similar
problems arising from speech and movement affect other electrophysiological
measures, such as heart-rate variability, respiration rate, or overall EEG
observations.
Salivary cortisol is relatively easy to collect. There appear to be significant
differences between low and high cortisol individuals. High cortisol individuals
tend to be more affected by (simulated) ATC than low cortisol individuals.
Subjective measures are sensitive, and easy to administer. The elaborate cross-
comparison of scales appears redundant to controllers, and may be omitted.
Where RT simulations are concerned, the physical workload scale may well be
redundant.
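(The 'elaborate cross-comparison of scales' is the NASA-TLX pairwise weighting step. As a hedged illustration, the sketch below contrasts the standard weighted overall score with the simpler unweighted average implied by dropping that step; the ratings and weights are invented.)

SUBSCALES = ["mental", "physical", "temporal", "performance", "effort", "frustration"]

def weighted_tlx(ratings, weights):
    """Standard overall workload: subscale ratings (0-100) weighted by the
    number of times each subscale was chosen in the 15 pairwise comparisons."""
    assert sum(weights.values()) == 15
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15.0

def raw_tlx(ratings):
    """Unweighted mean of the six subscales - the simpler form obtained by
    omitting the cross-comparison step."""
    return sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)

ratings = {"mental": 80, "physical": 10, "temporal": 70,
           "performance": 40, "effort": 65, "frustration": 30}   # example ratings
weights = {"mental": 5, "physical": 0, "temporal": 4,
           "performance": 2, "effort": 3, "frustration": 1}      # example weights

print(weighted_tlx(ratings, weights), raw_tlx(ratings))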
It is clear that the underlying pattern of sleep distribution may affect, and be
affected by, the learning or practice of ATC, whether as shift-work in the real
workplace, or where controllers are displaced from their normal environment while
participating in RT simulations. ‘Sleep logs’ as used in this study are relatively
cheap and simple, but they may need to be checked by the use of an ‘actometer’—a
sophisticated wrist-watch-like device which records sleep quantity or quality.

Conclusions

Transfer to Real Time Simulation


Determination of sleep patterns by sleep log/actometry and subjective self-
evaluation of sleepiness throughout the simulation.
Evaluation of stress reaction by salivary cortisol concentration—12
controllers—before and after selected exercises.
Evaluation of psychophysiological impact of task load using ERP. One or two
controllers, on heavily loaded sectors measured each day.

Further TRACON Studies


More detailed studies of brain function, examining topographical effects (where in
the brain does activity take place?) as well as Fourier Analysis (what frequencies
are involved?).
Other measurement methods—eye-movement, blink rate, pupil diameter.
Other control input methods (speech/on-screen graphical etc.)
Effects of ‘less skilled’ computer-generated pilots.

References
Brookings J.B., Wilson G.F. and Swain, C.R. 1996, Psychophysiological responses
to changes in workload during simulated Air Traffic Control, Biological
Psychology, 42, p361–377
Cabon P., Mollard R., Mourey F., Bougrine S. and Coblenz A. 1996, Towards a
general and predictive model of fatigue in aviation In Proceedings of the
fourth Pacific Conference on Occupational Ergonomics, Taipei, Taiwan, pp
622–625.
Fibiger W., Evans O. and Singer G. 1986, Hormonal Responses to Graded Mental
Workload, European Journal of Applied Physiology and Occupational
Physiology, 55, pp 339–343
Kramer A.F., Donchin E. and Wickens C.D. 1987, Event-Related Potentials as
indices of mental workload and attentional allocation, In Electrical and
Magnetic Activity of the Central Nervous System: Research and Clinical
Applications on Aerospace Medicine. AGARD Conference Proceedings No.
432, pp 14–1 to 14–14
Stampi, C., Stone P. and Michimori A. 1995, A new quantitative method for
assessing sleepiness, the Alpha Attenuation Test, Work and Stress, 9(2/3), pp
368–376
DRIVERS AND
DRIVING
WHAT’S SKILL GOT TO DO WITH IT? VEHICLE
AUTOMATION AND DRIVER MENTAL WORKLOAD

Mark Young & Neville Stanton

Department of Psychology
University of Southampton
Highfield
SOUTHAMPTON SO17 1BJ

Although vehicle automation is not unfamiliar on today’s roads, future


technology has the potential to reduce driver mental workload in addition to
relieving physical workload. Previous work in our laboratory has determined
that mental workload decreases significantly as more levels of automation
are introduced. The current paper addresses the question of whether this
picture changes across levels of driver skill, by measuring the mental
workload of drivers at four different levels of skill, and under four different
levels of automation. The preliminary data reported in this paper
demonstrate that level of driver skill has no effect on subjective mental
workload; however, it does interact with level of automation on a secondary
task measure. These results are interpreted with respect to potential effects
on performance, with implications for safety on the roads of tomorrow.

Introduction
There are a number of vehicle technologies on the horizon, some of which are intended to
assist drivers in their task (e.g., navigation aids), whilst others are designed to relieve the
driver of certain aspects of the driving task. So far, automation has merely operated at a
physical level, however future systems seek to take over some psychological elements of
driving. It is these latter devices which we are concerned with. Adaptive Cruise Control
(ACC) assumes longitudinal control of the user’s vehicle, controlling both speed and
headway, whilst Active Steering (AS) copes with lateral control, keeping the car within its
lane. Both devices are expected to be on the road within the next decade, and both have the
potential to reduce driver mental workload.
Although this may at first sound advantageous, the arrival of automation is accompanied
by a whole new set of problems. Stanton & Marsden (1996) use the history of automation
problems in aviation as a basis for summarising the potential problems which may arise in
road vehicle automation. For instance, overdependence on the automated system may lead to
skill degradation, and this in turn could result in more serious consequences in the event of
system failure. These views have been espoused by other noteworthy researchers in the field
(e.g., Bainbridge, 1983; Norman, 1990; Reason, 1990), and there is a general consensus that
supervisory control (cf. Parasuraman, 1987) is not a task best suited for humans.
One specific problem associated with automation is that of mental workload (MWL).
Automated systems have the potential for imposing both underload and overload. Under
normal circumstances, operators are faced with fewer tasks than they were previously able to
cope with; however, in an automation failure situation, they are immediately forced into a
situation of overload (cf. Norman, 1990). It is precisely this problem which our research is
concerned with. Underload is at least as serious an issue as overload (Leplat, 1978; Schlegel,
1993); however, its effects on performance have not been fully documented as yet. The
present paper describes the latest in a series of studies designed to examine the relationship
between automation, mental workload and performance.

Previous research
There is very little work in the public domain specifically directed at evaluating vehicle
automation. Nilsson (1995) used a driving simulator to investigate the effects of ACC on
performance in critical situations. In comparison with manual driving, ACC drivers were found
to be four times as likely to crash in the situation where collisions occurred (approaching a
stationary queue). However, Nilsson (1995) did not find any workload differences between the
groups on a subjective measure (the NASA-TLX; Hart & Staveland, 1988). Instead, performance
differences were attributed to drivers’ expectations about the ACC system.
Most of the other research of which we are aware in this field has been conducted in our own
laboratory—the Southampton Driving Simulator. Stanton, Young & McCaulder (in press)
explored the effects of ACC failure on performance. Faced with a malignant failure scenario,
one-third of all participants crashed into the lead vehicle. In this experiment, a secondary task
did demonstrate a significant difference in MWL between manual and ACC-assisted driving.
With an apparent discrepancy between the workload results of the two studies described
above, Young & Stanton (in press) performed a detailed experiment into automation and
workload. Participants were asked to drive under four levels of automation: manual, ACC,
AS, and ACC+AS. Both the secondary task and the NASA-TLX were used to measure
workload, and the pattern of results for each was identical. Using ACC alone did not reduce
MWL significantly when compared to manual driving; however, there was a significant
reduction when AS was engaged, and a further significant drop when both devices were used.
These studies are discussed further by Stanton & Young (1997).

Skill and automaticity


With these results in mind, then, the major theme of the most recent experiment addresses the
question of whether this pattern of results changes across levels of driver skill, and in
particular, degree of automaticity.
Driving is a skilled activity which is a classic example of an automatic behaviour (Stanton
& Marsden, 1996). Such overlearned responses can prove advantageous, as demonstrated by
the braking response to vehicles pulling out into the path of others (see Nilsson, 1995), or
they can have adverse consequences, as when drivers have a strong expectation that a junction
will be clear and fail to see oncoming traffic (Hale, Quist & Stoop, 1988).
When automation is engaged, all drivers—novices and experts alike—essentially satisfy
the criteria for automaticity (that is, fast, attention-free, unconscious processing). If a
situation of increased demand essentially transforms an expert into a novice (cf. Bainbridge,
1978), it is surely plausible to assume the reverse would be true in a situation of unusually low
demand (i.e., driving with automation). However, whereas the expert has an increased
knowledge base to draw upon, the novice is deprived of this. Therefore, it is important to
understand how automation may affect MWL and performance for drivers of all skill levels.
By measuring the mental workload of drivers at four levels of skill, and under four levels of
automation, it should be possible to determine how automation and automaticity interact.
Furthermore, Liu & Wickens (1994) claim that whilst subjective workload is influenced by the
presence of automation, a secondary task can discriminate automatic from nonautomatic
processing. Therefore, the present study extended that of Young & Stanton (in press) by repeating
the procedure for drivers at three additional levels of skill: novice, learner, and advanced.

Method
A mixed between- and within-subjects design was used. Level of automation constituted the
within-subjects variable, with four levels: manual, ACC, AS, and ACC+AS. Driver skill level
was the between-subjects factor, again with four levels: novice (i.e., never driven before), learner
(currently learning but does not hold a full licence), expert (holds a full licence), and advanced
(member of the Institute of Advanced Motorists in the UK). The latter group was chosen as a
high-level skill group because the Institute of Advanced Motorists (IAM) provides further training
for drivers with a full licence, and its members are statistically 75% less likely to be involved in
an accident than drivers without such training. The numbers of participants in the four groups
were 12, 20, 30 and 6 respectively. For the expert group, the same data were used as those
gathered by Young & Stanton (in press), and the general procedure for all groups was as
described in that paper.
Dependent measures were the NASA-TLX for subjective workload, administered
immediately following each trial, and a visuo-spatial secondary task, designed to occupy the
same attentional resources as driving. This was treated as an additional task to driving, and
was thus used as a measure of spare attentional capacity. The variable subjected to analysis
was the number of correct responses to the secondary task during the trial. A range of primary
task data was also recorded; however, these data will not be covered in this paper.
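The scoring of the NASA-TLX is not detailed here; purely as an illustration, the Python sketch
below shows how an overall workload (OWL) score of the kind analysed in the Results is
typically derived from the six subscale ratings and the pairwise-comparison weights described
by Hart & Staveland (1988). The subscale names, ratings and weights are illustrative
assumptions, not data from the study.

# Illustrative sketch: deriving a NASA-TLX overall workload (OWL) score.
# Ratings are on 0-100 scales; weights come from the 15 pairwise comparisons
# of the six dimensions and therefore sum to 15 (Hart & Staveland, 1988).

SUBSCALES = ["mental", "physical", "temporal", "performance", "effort", "frustration"]

def overall_workload(ratings, weights):
    """Weighted mean of the six subscale ratings."""
    assert sum(weights.values()) == 15, "pairwise-comparison weights must sum to 15"
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15.0

# Hypothetical ratings and weights for one trial (not study data)
ratings = {"mental": 70, "physical": 30, "temporal": 55,
           "performance": 40, "effort": 65, "frustration": 35}
weights = {"mental": 5, "physical": 1, "temporal": 3,
           "performance": 2, "effort": 3, "frustration": 1}
print(overall_workload(ratings, weights))  # prints 57.0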

Results
A repeated measures analysis of variance (ANOVA) was performed on each of the NASA-
TLX and the secondary task sets of data, including level of experience in the model as a
between-subjects factor. Only the overall workload (OWL) score was analysed for the
NASA-TLX; more detailed investigation of the subscales is beyond the scope of this paper.
It must be emphasised that these analyses were performed on a reduced data set, and more
robust conclusions will be available in the near future.
For the OWL score, then, a significant main effect was found for the within-subjects factor
of level of automation (F(3,192)=151.99; p<0.001). There were no significant results for level of
experience, nor for the interaction between the two variables.
Further exploration revealed that the effect of automation was exactly as expected on the
basis of Young & Stanton’s (in press) work. That is, manual and ACC-assisted driving did not
differ in workload (t(67)=2.04; p=0.045), whereas there was a significant reduction when AS
was engaged (t(67)=13.07; p<0.001), and a further significant reduction when both devices were
used (t(67)=8.57; p<0.001). Significance levels in these tests have been adjusted to allow for the
possibility of type I error. The results are illustrated in figure 1.

Figure 1. Mean OWL score for each skill group at all four levels of automation
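As an aside, the adjustment of significance levels referred to above can be pictured with a
short Python sketch of Bonferroni-corrected paired comparisons between automation levels.
The data structure and any scores passed in are hypothetical; only the general procedure, not
the study's actual analysis, is shown.

# Illustrative sketch: Bonferroni-adjusted paired comparisons of OWL scores.
# `scores` maps condition name -> per-participant scores, with the same
# participant order in every condition.
from itertools import combinations
from scipy import stats

def paired_comparisons(scores, alpha=0.05):
    pairs = list(combinations(scores, 2))
    adjusted_alpha = alpha / len(pairs)      # Bonferroni correction for multiple tests
    for a, b in pairs:
        t, p = stats.ttest_rel(scores[a], scores[b])
        verdict = "significant" if p < adjusted_alpha else "n.s."
        print(f"{a} vs {b}: t={t:.2f}, p={p:.4f} ({verdict} at alpha={adjusted_alpha:.4f})")

# e.g. paired_comparisons({"manual": [...], "ACC": [...], "AS": [...], "ACC+AS": [...]})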

The secondary task data were more intriguing. Again a significant main effect emerged for
level of automation (F(3,180)=204.77; p<0.001), and the effect of experience alone was
nonsignificant. However, a significant interaction did arise between the two variables
(F(9,180)=2.26; p<0.025). The data are represented graphically in figure 2.
Paired comparisons of the data for level of automation, pooled across experience levels,
reveal a similar pattern to the NASA-TLX data. However, now there is also a marginally
significant difference between manual driving and ACC-supported driving (t(66)=–2.53;
p<0.02), such that there is reduced MWL in the latter. The significant workload reductions
for driving with AS (t(64)=–12.23; p<0.001) and for driving under full automation (t(64)=–11.53;
p<0.001) still stand.

Figure 2. Mean secondary task score for each skill group at all four levels of
automation

To tease out the interaction, these paired comparisons were repeated within each level of
experience. For the learner and expert groups, the pattern resembles the now familiar results
obtained previously: no significant difference between manual and ACC driving, a significant
reduction in workload with AS, and a further significant reduction with ACC+AS. For the
novices, the effects of AS and ACC+AS are the same, but ACC also marginally reduces
workload compared with manual driving (t(11)=–3.39; p<0.01). Similarly, in the advanced
group there again appears to be a more stepwise reduction in workload across the four levels
of automation, although in this case the statistical significance is slightly less conclusive.
There are marginal differences between manual and ACC (t(5)=–4.44; p<0.01), between ACC
and AS (t(5)=–3.59; p<0.02) and between AS and ACC+AS (t(5)=–3.57; p<0.02). We are
cautious in interpreting these data, given the number of tests performed and the sample size
used. There are clearly significant differences between manual and AS driving (t(5)=–6.13;
p<0.005), and between ACC and ACC+AS driving (t(5)=5.14; p<0.005).

Discussion
If, as Liu & Wickens (1994) claim, subjective workload measures are influenced by
the presence of automation, then the current study indicates that this influence is completely
independent of experience. That is, levels of vehicle automation have the same effect on an
individual’s perception of MWL whether they are a complete novice or an advanced driver.
This consistent effect demonstrates that ACC alone has no effect on subjective workload,
whereas AS does reduce perceived MWL, and this reduction is augmented further when both
ACC and AS are used simultaneously.
The secondary task, which according to Liu & Wickens (1994) can discriminate automatic
from nonautomatic processing, paints a more interesting picture. As far as learners and
experts are concerned, the pattern of results mirrors that for the NASA-TLX. This is an
expected result, for the secondary task is also a widely accepted measure of MWL (e.g.,
Schlegel, 1993). However, this view is upset when the novice and advanced groups are
considered. Here, there seems to be a more stepwise reduction in workload as more levels of
automation are introduced. Although some of the results were statistically inconclusive, we
feel that with the increased data set we intend to collect, this pattern will become more robust.
That is, these groups perform better on the secondary task when driving with ACC+AS than
they do with AS alone, which in turn is better than ACC alone, and finally performance with
ACC alone is better than when driving manually.
As far as the automaticity paradigm is concerned, this is an extremely intriguing finding.
The fact that the influence of automation on drivers’ spare attentional capacity is mediated by
their level of experience means this is certainly an area which merits deeper exploration.
Although it is currently difficult to explain why the results for novices and advanced drivers
should be equivalent, further research should be able to shed some light on the matter. For
now, it is perhaps interesting to note that part of the IAM assessment includes a running
commentary on one’s driving. This runs counter to what is known about expert
performance, since automatic behaviours should be processed unconsciously. To say that a
highly trained person must always perform at a conscious level is something of a paradox (cf.
Barshi & Healy, 1993), yet the merits of such processing in an unexpected situation
cannot be denied. Indeed, in this respect the advanced driver does perform in a similarly
declarative manner to the novice, albeit at a much higher level of abstraction. In light of this,
perhaps the equivalence of responses in these two groups should not be so surprising.

Conclusions and future research


It has been found that subjective MWL in a driving task is heavily influenced by the presence
of automation, irrespective of the driver’s level of experience. However, a secondary task
measure does reveal differences in the pattern of responses across levels of skill. This is
interesting from the perspective of relating automation to automaticity, for the secondary task
is a measure of automaticity as much as it is of workload.
The most obvious next step is to analyse the primary task performance data. It may be the
case that intermediate levels of automation adversely affect performance under normal
circumstances. For instance, a driver’s steering performance may be different if driving with
ACC than under manual conditions. Similarly, longitudinal control may be affected by the
introduction of AS. Using these data in conjunction with the workload results reported here
will help us understand whether mental underload is detrimental to performance.

References
Bainbridge, L. 1978, Forgotten Alternatives in Skill and Work-load, Ergonomics, 21, 169–185
Bainbridge, L. 1983, Ironies of Automation, Automatica, 19, 775–779
Barshi, I. & Healy, A.F. 1993, Checklist Procedures and the Cost of Automaticity, Memory
and Cognition, 21, 496–505
Hale, A.R., Quist, B.W. & Stoop, J. 1988, Errors in Routine Driving Tasks: A Model and
Proposed Analysis Technique, Ergonomics, 31, 631–641
Hart, S.G. & Staveland, L.E. 1988, Development of NASA-TLX (Task Load Index):
Results of empirical and theoretical research. In P.A.Hancock & N.Meshkati (Eds.),
Human Mental Workload, (Elsevier Science, North-Holland) 139–183
Leplat, J. 1978, Factors Determining Work-load, Ergonomics, 21, 143–149
Liu, Y. & Wickens, C.D. 1994, Mental Workload and Cognitive Task Automaticity: An
Evaluation of Subjective and Time Estimation Metrics, Ergonomics, 37, 1843–1854
Nilsson, L. 1995, Safety Effects of Adaptive Cruise Controls in Critical Traffic Situations,
Proceedings of the Second World Congress on Intelligent Transport Systems, 3, 1254–1259
Norman, D.A. 1990, The ‘Problem’ with Automation: Inappropriate Feedback and
Interaction, not ‘Over-Automation’, Phil. Trans. R. Soc. Lond. B, 327, 585–593
Parasuraman, R. 1987, Human-Computer Monitoring, Human Factors, 29, 695–706
Reason, J.T. 1990, Human Error (Cambridge University Press, Cambridge)
Schlegel, R.E. 1993, Driver Mental Workload. In B.Peacock & W.Karwowski (Eds.),
Automotive Ergonomics, (Taylor & Francis, London) 359–382
Stanton, N.A. & Marsden, P. 1996, Drive-By-Wire Systems: Some Reflections on the Trend
to Automate the Driver Role, Safety Science, 24, 35–49
Stanton, N.A. & Young, M.S. 1997, Driven to distraction? Driving with automation,
Proceedings of Autotech ’97, 77–86
Stanton, N.A., Young, M.S. & McCaulder, B. in press, Drive-By-Wire: The Case of Driver
Workload and Reclaiming Control with Adaptive Cruise Control, Safety Science
Young, M.S. & Stanton, N.A. in press, Automotive Automation: Investigating the Impact on
Driver Mental Workload, International Journal of Cognitive Ergonomics
THE USE OF AUTOMATIC SPEECH RECOGNITION
IN CARS: A HUMAN FACTORS REVIEW

Robert Graham

HUSAT Research Institute,


The Elms, Elms Grove,
Loughborough, Leics. LE11 1RG
tel: +44 1509 611088
email: r.graham@Lboro.ac.uk

Automatic speech recognition (ASR) has been successfully incorporated


into a variety of domains, but little attention has been given to in-car
applications. Advantages of speech input include a transfer of loading away
from the over-burdened visual-manual modality. However, the use of ASR in
cars faces the barriers of high levels of noise and driver mental workload.
This paper reviews some of the likely in-car applications of ASR, concluding
that its widespread adoption will be driven by the requirement for hands-free
operation of mobile phone and navigation functions. It then discusses some
of the human factors issues which are pertinent to the use of ASR in the in-
car environment, including dialogue and feedback design, and the effects of
the adverse environment on the speaker and speech recogniser.

Introduction
Automatic speech recognition (ASR) technology has been successfully incorporated into a
variety of application areas, from telephony to manufacturing, from office to aerospace.
However, so far, little attention has been given to in-car applications. This is perhaps
surprising given that one of the major advantages of speech input over manual input is that the
eyes and hands remain free. The task of safe driving could clearly benefit from a transfer of
loading from the over-burdened visual-manual modality to the auditory modality. Indeed,
numerous studies have confirmed the potential adverse safety impacts of operating a visual-
manual system (e.g. a mobile phone or car radio) while on the move. This situation is likely
to be exacerbated by the rapid growth of Intelligent Transportation Systems (ITS) such as
navigation or traffic information systems, which require complex interactions while driving.
As well as improving driving safety, ASR could increase the accessibility and acceptability of
in-car systems by simplifying the dialogues between the user and system, and the processes
of learning how to use the system.
The main difficulty facing the incorporation of speech into in-car systems comes from the
hostile environment. Noise (from the vehicle engine, road friction, passengers, car radio, etc.)
can adversely affect speaker and speech recognition performance. The car is also
characterised by a variety of concurrent tasks to be carried out while using speech
(particularly the primary task of safe driving), and varying levels of driver mental workload.
As well as these and other human factors issues, Van Compernolle (1997) suggests that the
automotive industry is a slow acceptor of new technologies in general, and that there has been
confusion in the past about which speech applications to implement.
Despite these barriers, ASR may be useful for a number of different in-car applications,
each with particular requirements (in terms of vocabulary, dialogue, etc.). These applications
are outlined in the sections below. There then follows a general discussion of human factors
issues relevant to the use of ASR in cars.

Applications of ASR in Cars


Likely in-car applications for ASR can be put into 3 groups—standard vehicle functions
(including the stereo), phones and navigation/information systems. Van Compernolle (1997)
rates the importance of incorporating ASR into these functions as low, high and essential
respectively.

Standard Vehicle Functions


Any non-safety-critical vehicle control may benefit from the incorporation of speech
recognition, particularly those whose interfaces have multiple control options. A prime
candidate is the car stereo (radio/tape/CD). For example, Haeb-Umbach and Gamm (1995)
discuss a system in which speaker-independent continuous-speech recognition is employed to
access various functions (e.g. “CD four, track five”), and speaker-dependent recognition allows
users to define their own names for radio stations (e.g. “change to BBC now”). Using speech
for the car stereo benefits from compatibility in input and output modalities; that is, the
auditory input of a speech command results in the auditory feedback of the change in radio or
CD output.
Other applications include the car’s climate control system, and the mirrors, windscreen
wipers, seats, etc. Although speech input to such basic car systems may not have significant
advantages over manual input for most users, it could allow drivers with physical disabilities
to use their arms and/or legs solely for the most important tasks of safe driving.

Phones
In recent months, a number of high-profile legal cases have highlighted the dangers of operating a
mobile phone while driving. These have led to adjustments in the Highway Code in the UK,
and the belief that legislation preventing the manual operation of phones on the move is
inevitable (Vanhoecke, 1997). Consequently, much effort is being invested towards the
development of voice-operated, hands-free kits, initially for keyword dialling and eventually
for all phone functions. The former application requires speaker-dependent recognition,
allowing the user to dial commonly-used numbers through keywords (e.g. “mum”, “office”).
The latter needs speaker-independent, continuous-word recognition for inputting numbers or
commands. The ASR capability may be incorporated into the phone, accessed over the
mobile network, or pre-installed by the car manufacturer.

Navigation and Travel/Traffic Information


Technologies such as navigation systems (which aid drivers in planning and finding their
destinations) or travel/traffic information systems (which inform drivers of local ‘events’
such as accidents, poor weather, services, etc.) have the potential to greatly increase the
complexity of driver-system interactions. Current systems require the driver to input
information, such as a journey destination, while on the move, and often use an array of
buttons or rotary switches to accomplish this. The near future is also likely to see a rise in the
prevalence of driver-requested services (for example, the ability to investigate the availability
of parking spaces in a town, or the location of the next petrol station), for which ASR could
be even more useful.
Fully voice-operated navigation requires that a user can input thousands of possible
geographical names to specify a destination. Apart from the obvious technical difficulties of
large-vocabulary, phoneme-based recognition, there are added problems that the system must
cope with multi-national names, and poor pronunciation by users unfamiliar with the place
name they are speaking. One solution is to incorporate standard word recognition for
commonly-used names, with a fall-back to a spelling mode for less frequent inputs (Van
Compernolle, 1997). Of course, spelling itself is a complex process for a speech recogniser
due to the highly confusable ‘e-set’ (b, c, d, e, g, etc. all sound similar).

Human Factors Issues of ASR in Cars


Much has been written in the past about the human factors of ASR (see, for example, Hapeshi
and Jones, 1988; Baber, 1996). The following sections discuss those issues which are
particularly pertinent to the incorporation of ASR in cars.

User Population
It is generally accepted that a variety of user variables (age, gender, motivation, experience,
etc.) may affect the success of the speaker in operating an ASR system. The avionics industry
is probably the closest to the automotive industry in terms of the environmental demands on
speech recognition (concurrent tasks, noise, etc.); however, whereas aircraft pilots tend to be
highly-motivated, well-trained, younger and male, car drivers make up a heterogeneous
sample of the general public. Indeed, it should be noted that there are very few
successful public applications of ASR, probably due to the wide variation in speaking style,
vocabulary, etc.
Perhaps the most important factor is experience. Both the user’s experience with the
specific recognition system, and their experience with technology in general may affect
performance. Users who are computer literate may well adapt to speech systems more readily
than naive users (Hapeshi and Jones, 1988), but they may also have over-inflated expectations
of the technology. The implication for the design of in-car ASR is that both pre-use and on-
line training must be provided. For example, the system designed by Pouteau et al (1997)
allows the user to ask the system for assistance at any stage of the dialogue, to which the
system responds with its current state and allowable operations. This system also provides
automatic help if the user falls silent in the middle of a dialogue. As well as help for naive
users, the interface should adapt for expert users; for example, shortening the dialogues as the
user becomes familiar with them to avoid frustration.

Dialogue Initiation
Leiser (1993 p.277) notes that “an unusual feature of user interfaces to in-car devices is that
there will be a combination of user-initiated and system-initiated interaction. For example,
interaction with a car stereo will be largely user-initiated. A car telephone will demand a
roughly equal mixture…an engine monitoring system will be largely system-initiated”. Both
types of dialogue initiation must be carefully designed.
Because of the prevalence of sounds in the car which are not intended as inputs to the ASR
device (e.g. the radio or speech from/with passengers), user-initiated dialogues must involve
some active manipulation. One possibility is a ‘press-to-talk’ button mounted on or near the
steering wheel. However, this may result in one of the major potential advantages of voice
control over manual control (hands-free operation) being lost. An alternative is some keyword
to bring the system out of standby mode (e.g. “wake up!”, “attention!”, “system on!”).
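As a rough illustration of this idea, the Python sketch below shows a dialogue manager that
stays in standby until it hears a wake-up keyword, so that radio output or passenger speech is
not treated as a command. The keyword list, class and method names are illustrative
assumptions, not part of any product or system described in this paper.

# Illustrative sketch: keyword-based initiation of user-initiated dialogues.
WAKE_WORDS = {"wake up", "attention", "system on"}

class DialogueManager:
    def __init__(self):
        self.active = False                      # standby until a wake word is heard

    def on_utterance(self, recognised_text):
        text = recognised_text.lower().strip()
        if not self.active:
            if text in WAKE_WORDS:
                self.active = True
                return "How can I help?"
            return None                          # ignore speech not addressed to the system
        self.active = False                      # one command per activation, then back to standby
        return f"Executing command: {text}"      # hand-off to a command interpreter (not shown)

dm = DialogueManager()
dm.on_utterance("change to BBC now")             # ignored: system still in standby
dm.on_utterance("wake up")                       # -> "How can I help?"
dm.on_utterance("change to BBC now")             # -> "Executing command: change to bbc now"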
In system-initiated dialogues, care must be taken not to disrupt the user’s primary task of
safe driving. Unless an intelligent dialogue management system which estimates the driver’s
spare attentional capacity is incorporated (see Michon, 1993), the system may request
information when the driver is unable to easily give it. Therefore, system prompts should be
designed to reassure the driver that a dialogue can be suspended and successfully taken up
again later (Leiser, 1993).

Feedback
Feedback is any information provided by an ASR system to allow the user to determine
whether an utterance has been recognised correctly and/or whether the required action will be
carried out by the system (Hapeshi and Jones, 1988). As a general human factors principle,
some sort of feedback should always be provided, and it has been shown that this increases
system acceptance (Pouteau et al, 1997).
For many in-car applications, feedback will be implicit (‘primary feedback’); that is, the
action of the system (e.g. change of radio station, activation of windscreen wipers, phone
ringing tone) will directly inform the user what has been recognised. In these cases,
additional feedback from the speech system may not be required.
If explicit (‘secondary’) feedback is necessary for cases when system operation is not
obvious, or when the consequences of a misrecognition are particularly annoying, there are a
number of possibilities. A simple system of tones has been found to be efficient and well-
liked for certain ASR applications, but in the car environment there are likely to be a variety
of easily-confusable abstract tones present. Spoken feedback is transient and makes demands
on short-term memory (Hapeshi and Jones, 1988). It is also impossible to ignore, and may be
irritating for the user. Visual feedback via a simple text display has the advantage that it can
be scanned as and when required, but requires the eyes to be taken off the road. A
combination of spoken and visual modes may be preferable (Pouteau et al, 1997).

Effects of Noise
The failure of ASR devices to cope with the noisy car environment is probably the main
reason why in-car applications of speech input have been unsuccessful in the past. Noise can
adversely affect both speaker and speech recognition performance at a number of levels.
First, noise can impact directly on the recognition process by corrupting the input signal to
the recogniser. Second, speakers tend to sub-consciously adapt their vocal effort to the
intensity of the noise environment (the ‘Lombard Effect’), which then adversely affects
recognition accuracy. Third, noise can cause stress or fatigue in the speaker, which affects the
speech produced, which in turn affects the recognition accuracy. And so on. Noise can also
impact on cognitive processes outside speech production, which may affect the ability of the
user to carry out the required tasks concurrently.
ASR in the noisy car environment may be less problematic than in other environments such
as offices or industrial settings, as the noise is more predictable. Although in-car noise comes from a
variety of sources (engine, tyres, wind, radio, passenger speech, etc.), the speed of the vehicle
can give reference points for reasonably effective noise reduction (Pouteau et al, 1997).
Technological solutions for coping with noise include selection of appropriate microphone
arrays, acoustic cancellation (especially from known sources such as the radio), and active
noise suppression through masking or spectral subtraction (Van Compernolle, 1997).
Poor accuracy associated with the Lombard Effect can be reduced by training the speech
recognition templates in a variety of representative noisy environments. However, because in-
car recording can be expensive, some success can be found by artificially degrading speech
with environmental noise (Van Compernolle, 1997). Also, for a given individual, the effects
of noise on speech production are relatively stable; therefore, speaker-dependent training
of particular template sets for ‘noisy speech’ may be an effective solution (Hapeshi and
Jones, 1988).
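One way to picture the artificial degradation mentioned above is the sketch below, which
mixes recorded car-cabin noise into clean speech at a chosen signal-to-noise ratio. It assumes
mono waveforms at the same sample rate held in NumPy arrays; file handling, and the
particular noise sources and SNR values a real enrolment scheme would use, are left open.

# Illustrative sketch: degrade clean speech with car noise at a target SNR.
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise scaled so the mixture has the requested SNR (dB)."""
    noise = np.resize(noise, speech.shape)                 # loop or trim noise to length
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # choose gain g so that p_speech / (g**2 * p_noise) equals 10**(snr_db/10)
    g = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10.0)))
    return speech + g * noise

# e.g. noisy = mix_at_snr(clean_utterance, cabin_noise, snr_db=5)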

Effects of Workload and Stress


Speech has also been shown to be vulnerable to the effects of speaker workload or stress (e.g.
Graham and Baber, 1993). Sources of driver mental workload include (a) the driving task
itself (e.g. lane-keeping, speed choice, keeping a safe headway and distance from other
vehicles), (b) the driving environment (e.g. traffic density, poor weather, road geometry, etc.)
and (c) the use of in-vehicle systems (e.g. the presentation, amount and pacing of information
to be assimilated and remembered). For in-car applications of ASR, this implies that the
speech recogniser may fail just when it is needed most; in high workload conditions where
the driver cannot attend to visual-manual controls and displays.
Similar to the strategy adopted to overcome noise, enrolment of speech templates under a
variety of representative task settings may reduce the effects of stress or workload on ASR
performance. As users tend to revert to speaking more-easily-recalled words under stress,
system vocabularies should be designed to be ‘habitable’. The size and complexity of the
vocabulary might also be reduced in stressful situations (Baber, 1996).

Conclusions
Despite the barriers, it seems very likely that ASR will be widely incorporated into cars in the
near future. Legislation relating to the use of mobile phones on the move and the rapid growth
of the ITS market will drive its adoption. However, little research has been directed towards
the use of ASR in the car environment. Further work is required into ASR dialogue design for
in-car applications, particularly with respect to the mode and timing of feedback while the
user is engaged in the concurrent driving task. Work is also required into the effects of the
particular sources of noise and mental workload found in the in-car environment on the
speaker and speech recogniser.

Acknowledgements
This work was carried out as part of the SPEECH IDEAS project, jointly funded by the
ESRC and the DETR under the UK Government’s LINK Inland Surface Transport
programme. For further details of the project, please contact the author.

References
Baber, C. 1996, Automatic speech recognition in adverse environments, Human Factors,
38(1), 142–155
Graham, R. and Baber, C. 1993, User stress in automatic speech recognition. In E.J. Lovesey
(ed.) Contemporary Ergonomics 1993, (Taylor & Francis, London), 463–468
Haeb-Umbach, R. and Gamm, S. 1995, Human factors of a voice-controlled car stereo. In
Proceedings of EuroSpeech ’95: 4th European Conference on Speech Communication
and Technology, 1453–1456
Hapeshi, K. and Jones, D.M. 1988, The ergonomics of automatic speech recognition
interfaces. In D.J.Oborne (ed.) International Reviews of Ergonomics, (Taylor &
Francis, London), 251–290
Leiser, R. 1993, Driver-vehicle interface: dialogue design for voice input. In A.M. Parkes and
S.Franzen (eds.) Driving Future Vehicles, (Taylor & Francis, London), 275–293
Pouteau, X., Krahmer, E. and Landsbergen, J. 1997, Robust spoken dialogue management for
driver information systems. In Proceedings of EuroSpeech ’97: 5th European
Conference on Speech Communication and Technology, vol. 4, 2207–2210
Van Compernolle, D. 1997, Speech recognition in the car: from phone dialing to car
navigation. In Proceedings of EuroSpeech ’97: 5th European Conference on Speech
Communication and Technology, vol. 5, 2431–2434
Vanhoecke, E. 1997, Hands on the wheel: use of voice control for non-driving tasks in the car,
Traffic Technology International, April/May '97, 85–87
INTEGRATION OF THE HMI FOR DRIVER SYSTEMS:
CLASSIFYING FUNCTIONALITY AND DIALOGUE

Tracy Ross

HUSAT Research Institute


The Elms, Elms Grove
Loughborough, Leics, LE11 1RG
Telephone: +44 1509 611088
email: t.ross@lboro.ac.uk

The implementation of advanced driver information and vehicle control systems


is becoming more widespread, with many systems already appearing on the
market. If information overload and reduced driver safety are to be avoided,
then consideration must be given to the integration of the HMI for such systems.
This paper reports on research which aims to provide human factors design
advice to vehicle manufacturers and system suppliers. The main focus of this
paper is the development of two classification systems: one for detailed system
functionality, the other a generic description of the inputs and outputs that
make up a dialogue. The way in which these will contribute to the development
of the design advice is described.

Introduction
It is widely accepted within the transport telematics community that the piecemeal
introduction of advanced driver information and control systems into the vehicle is
undesirable, and raises concerns regarding the potential for information overload and human
error. An integrated systems approach is required to ensure that the design of the interface
(i.e. the input and output mechanisms) used to present information to the driver is soundly
based on human factors principles. To date there is a lack of research which analyses and
synthesises current ergonomics knowledge in this area, in order that it may provide practical
guidance for the integration of the interface to any combination of telematics systems in a
modular fashion.
An EPSRC funded project (INTEGRATE) is investigating this topic, with the ultimate
aim of developing design advice for vehicle manufacturers and system suppliers. This paper
reports on part of the state-of-the-art review conducted at the beginning of this project (Ross
et al, 1997). The whole review covered a range of topics relevant to in-vehicle HMI
integration, namely: the relevant cognitive psychology literature; human factors research into
HMI integration in both automotive and aerospace applications; a review of display and
control technologies of potential use in the vehicle; and a summary of relevant design
guidelines and standards existing and under development.
This paper focuses on two other areas of the review which are fundamental to the
development of design advice planned for later in the project. These are (a) the classification
of the detailed functionality of the systems considered for integration and (b) a generic
classification which can be applied to each individual input and output involved in the system
dialogue. In addition some practical examples of the design of an integrated in-vehicle system
are provided, based on earlier commercial work at HUSAT.

Development of the Classifications

Functional Classification
The literature was reviewed in order to discover any existing, well accepted classifications of
in-vehicle driver systems. It is necessary within INTEGRATE to develop such a classification
for three reasons: first, to clarify the scope of the project for those involved; second, to create
a list of system functions to discuss with manufacturers during the ‘Industry Requirements’,
the next stage of the project; and finally, to form a basis for a more detailed system
classification later in the project (i.e. at the advice stage, when concrete guidelines are likely
to be required for functional integration).
Research on functional classifications showed that the most detailed work has been
conducted in the U.S. In particular, the University of Michigan Transportation Research
Institute (UMTRI) (Serafin et al, 1991), the Battelle Human Factors Transportation Center
(Lee et al, 1997) and the University of Iowa (Mollenhauer et al, 1997) have conducted such
research from a human factors viewpoint (most classifications have been driven by the
technology). The most ‘official’ classification has been produced by the standards
organisation ISO (ISO, 1996). The decision of the project was thus to use the ISO
classification as a basis for the INTEGRATE approach, incorporating additional items from
the UMTRI, Battelle and University of Iowa work, and particularly, extending the list to
include conventional functionality, e.g. that of the audio system, speedometer, control pedals.
The classification produced will be adapted as necessary over the life of the project.

Due to space restrictions, only the main headings of the classification can be reproduced here.
In the full listing, each heading has up to ten sub-headings (Ross et al, 1997).

On-Trip Driver Information (e.g. prevailing traffic conditions)


Route Planning, Guidance & Navigation (e.g. dynamic route guidance)
Personal Information Services (e.g. filling station location)
Mobile Office Services (e.g. phone)
Entertainment/Comfort (e.g. radio)
Electronic Financial Transactions (e.g. road pricing)
Incident Management (e.g. emergency call)
Emergency Notification/Personal Security (e.g. automatic collision notification)
Longitudinal Collision Avoidance (e.g. intelligent cruise control systems)
Lateral Collision Avoidance (e.g. automatic steering/lane support)
Intelligent Junctions (e.g. clarification of right of way rules)
Vision Enhancement (e.g. enhancement of the night road scene)
Safety Readiness (e.g. driver alertness monitoring)
Vehicle Status/Warnings (e.g. oil level)
Mechanical Controls (e.g. steering wheel)
‘Secondary’ Controls/Displays (e.g. side mirrors)

Classification of Inputs and Outputs


Previous commercial work at HUSAT on the topic of integration necessitated the
development of a classification system for a multi-function driver system. This system
included conventional driver tasks, advanced driver information and some elements of
vehicle control. The need for the classification of inputs and outputs arose from the
requirement to ‘code’ such a large number of interactions in a way that would aid the
specification of an interface design for such a system. Due to the commercial nature of this
work, the classification system was not based on a thorough research review and, as such,
relied solely on the human factors expertise of the staff. The current project was partly
inspired by this piece of work and thus it was appropriate to review the literature, in
hindsight, to assess any other similar classification systems.
The subsequent search was disappointing. Little was found which described inputs and
outputs in sufficient detail to be of use to the project. There were some exceptions. In the
driving area specifically, the previously referenced work of Battelle and the University of
Iowa showed some activity in this area. In the wider human factors arena, there was little
which was of practical use for this project. Therefore, the classification produced here is
based on that developed during the commercial work at HUSAT. This classification has not
been validated in any way to date. Its use within the project during the development of the
design advice will go some way towards this validation.

Figure 1. The classification of inputs used by the INTEGRATE project


Figure 2. The classification of outputs used by the INTEGRATE project

Developing Design Advice


Classifying the inputs and outputs which make up the system dialogue is a first step towards
developing generic design advice for integrated systems. The past commercial work at
HUSAT on the feasibility of an integrated system employed this approach with some
success. However, this was for a limited number of subsystems only. The approach to the
design was, for each system function, to classify each dialogue component (inputs and
outputs) according to the above system. This enabled the identification of the range of input/
output types to be incorporated in the design for that particular set of sub-systems. The next
stage was to list these types and, for each one, to identify the most appropriate input and
output method(s). An example for input is that for an action which falls into the category
‘user initiated, direct, move, vertical’, the appropriate input devices are up/down buttons or a
turn knob. As an example for output, for an information item which is ‘system generated,
direct, information, status’, e.g. traffic messages, it would be most suitable to use speech in
conjunction with text (possibly symbols if well known) together with an initial alert tone.
By this process, it was possible to identify both the minimum and optimum sets of input
and output devices that could/should be specified to achieve a usable integrated system. For
the system in question the design solution incorporated:

• A new, dedicated display area, showing icons of the various sub-systems (those that are
‘active’ are illuminated, the one currently in use is indicated)
• A flexible format display area in 3 sections: a user input bar at the top; a large centre
section where prompts, feedback and other system output will always appear; a soft key
label bar, with six associated keys.
• Three new flexible input devices: a quartet of vertical and horizontal ‘move’ keys,
an alphanumeric keypad and a pair of positive/negative keys (for yes/no, proceed/
cancel, etc.).
• New dedicated displays/controls as follows: steering wheel/stalk controls for frequent/
urgent functions; a parking aid display in the rear; a speaker for auditory output.
• Dedicated displays/controls using existing equipment, e.g. set cruising speed associated
with the speedometer, activation of rear parking aid by engaging reverse gear.

This preliminary specification is obviously only the first step in the design of a usable and
safe system for use on the move. For example, no attempt was made to identify what should
be accessible whilst driving, as this can only be decided during the detailed design of a
system. However, it is a good starting point for the INTEGRATE project. What the project
hopes to do is to refine this first part of the approach and, more significantly, to develop
design advice in order to support the subsequent decisions that a design team has to make.
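To make the classification-to-device mapping described above more concrete, a minimal
Python sketch is given below. The category tuples and device suggestions simply restate the
two examples in the text; the table format, names and function are illustrative assumptions
rather than the INTEGRATE project's actual advice.

# Illustrative sketch: lookup from a dialogue-component classification to
# candidate input or output methods.
INPUT_ADVICE = {
    ("user initiated", "direct", "move", "vertical"): ["up/down buttons", "turn knob"],
}

OUTPUT_ADVICE = {
    ("system generated", "direct", "information", "status"): [
        "speech plus text (symbols if well known)", "initial alert tone"],
}

def recommend(kind, classification):
    table = INPUT_ADVICE if kind == "input" else OUTPUT_ADVICE
    return table.get(tuple(classification), ["no advice defined for this category"])

print(recommend("input", ["user initiated", "direct", "move", "vertical"]))
# -> ['up/down buttons', 'turn knob']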

Industry Requirements
The next stage of the project is to identify the industry requirements for the format and
content of the design advice to be produced. Until this is complete, the exact nature of the rest
of the research is uncertain. It is hoped, however, that use can be made of more advanced
technologies for conveying the design advice. For example, certain car companies are already
employing time saving design tools in other areas of vehicle design. Such software systems
endeavour to create an ‘environment’ within which the design team are free to explore
different implementation options. The software has in-built rules and knowledge which
restricts the design to that which is technically or legally possible as determined by current
knowledge. That is, the designer does not need to ‘know’ the rules and regulations in order to
design within them. The INTEGRATE project will investigate the possibilities of offering
human factors advice in this way.

Acknowledgements
The INTEGRATE Project is funded by the EPSRC Innovative Manufacturing Initiative, Land
Transport Programme, Telematics Call. Other partners in the project are Coventry University
Knowledge Based Engineering Centre, and the Motor Industry Research Association.

References
ISO 1996, Transport Information and Control Systems: Fundamental TICS Services,
Technical Report of ISO TC204/WG1 Architecture, Taxonomy and Terminology,
May 1996
Lee, J.D., Morgan, J., Wheeler, W.A., Hulse, M.C. and Dingus, T.A. 1997, Development of
Human Factors Guidelines for Advanced Traveler Information Systems (ATIS) and
Commercial Vehicle Operations (CVO): Description of ATIS/CVO Functions,
Publication No. FHWA-RD-95–201, (U.S. Department of Transportation, Federal
Highway Administration)
Mollenhauer, M.A., Hulse, M.C., Dingus, T.A., Jahns, S.K. and Carney, C. 1997, Design
decision aids and human factors guidelines for ATIS displays. In Y.I. Noy (ed.)
Ergonomics and Safety of Intelligent Driver Interfaces, (Lawrence Erlbaum, Mahwah,
New Jersey), 23–61
Ross, T., Burnett, G., Graham, R., May, A. and Ashby, M. 1997, State-of-the-Art Review:
Human Machine Interface Integration for Driver Systems, INTEGRATE Project,
Deliverable 1. EPSRC Innovative Manufacturing Initiative, Land Transport
Programme, Telematics
Serafin, C., Williams, M., Paelke, G. and Green, P. 1991, Functions and Features of Future
Driver Information Systems, Technical Report UMTRI-91–16, (University of
Michigan Transportation Research Institute)
SUBJECTIVE SYMPTOMS OF FATIGUE AMONG
COMMERCIAL DRIVERS

P.A.Desmond

Human Factors Research Laboratory


University of Minnesota
141 Mariucci Arena
1901 Fourth St. S.E, Minneapolis MN 55414
U.S.A

A study of real-life driving is reported in which the subjective


symptoms of fatigue were explored in commercial drivers. Drivers
completed subjective measures to assess fatigue, mood and
cognitive interference before and after their driving trip. A post-
drive measure of active coping was also administered. Prior to the
driving trip, drivers also completed the Fatigue Proneness scale of
the Driver Stress Inventory. The findings showed that subjective
fatigue was characterised by changes in mood and cognitive state.
Drivers not only experienced increased fatigue but also experienced
increased tension, depression, annoyance, and cognitive
interference. The study also showed that post-drive tension and
fatigue related to drivers’ reports of fatigue reactions to real driving,
as measured by the Fatigue Proneness scale. However, active coping
was unrelated to changes in most of the subjective state measures.

Introduction
Driver fatigue remains a significant problem for the commercial driving industry.
Many studies have been conducted to examine professional drivers’ performance
over prolonged periods of time (e.g. Mackie & Miller, 1978). However, as
McDonald (1984) points out, researchers have largely neglected drivers’ subjective
experience of fatigue. Desmond (1997) has stressed the importance of the
subjective component of fatigue in the light of transactional theories of the effects
of stressors on driver performance (Matthews, 1993). Transactional theories of
driver stress propose that stress reactions are the product of a complex dynamic
interaction between the individual and his or her environment such that stress
outcomes are the result of the driver’s appraisal of the demands of driving and
coping strategies. Since fatigue and stress share similar energetical properties (e.g.
Cameron, 1974), we can propose that the driver’s appraisals and coping strategies
also play an important role in driver fatigue. The present study attempted to explore
this possibility in a study of professional drivers’ subjective states.
Recent simulated studies of driver fatigue (e.g. Desmond, 1997) have shown that
the subjective pattern of the fatigue state is a complex one. In Desmond’s studies,
drivers performed both a fatiguing drive, in which they performed a demanding
secondary attention task in addition to the primary task of driving, and a control
drive without a secondary task. Drivers completed a variety of subjective state
measures to assess mood, fatigue, motivation and cognitive interference before and
after the drives. The findings indicated that following the fatiguing drive, drivers
experienced increased subjective fatigue symptoms such as physical and perceptual
fatigue, as well as boredom, de-motivation and apathy. Drivers also experienced
increased tension, depression and cognitive interference indicating that the
fatiguing drive was mildly stressful. Thus, it is expected that fatigue will be
characterised by changes in mood and cognitive state in the present study.
Desmond also investigated the relationship between Fatigue Proneness, a
dimension of driver stress measured by the Driver Stress Inventory (DSI: Matthews,
Desmond, Joyner, Carcary & Gilliland, 1997), active coping, and a variety of state
measures in these studies. The findings indicated that Fatigue Proneness predicted
changes in state measures of fatigue, mood and cognitive interference. Moreover,
active coping predicted changes in the fatigue state measures, and also predicted
changes in mood and cognitive state measures. These studies provided support for
the utility of a transactional model of stress that incorporates a Fatigue Proneness
trait. The studies showed that Fatigue Proneness, like other DSI dimensions such as
Aggression and Dislike of Driving, relates to specific mood states and coping
strategies. In the present study, the Fatigue Proneness scale and an active coping
scale were used to predict changes in measures of fatigue, mood and cognitive state
in a sample of professional drivers. It was expected that both Fatigue Proneness and
active coping would predict changes in fatigue, mood and cognitive state measures.

Method
Fifty-eight Australian professional truck drivers participated in the study. Drivers ranged in
age from 23 to 61 years (M=37.09). Time since obtaining an Australian truck
driver’s licence ranged from 2 years to 38 years (M=15.08). The duration of driving
trips ranged from 6 hours to 18 hours and 50 minutes (M=11 hours and 38
minutes). All drivers completed measures of fatigue, mood and cognitive
components of subjective stress states before and after their driving trip. An 11-item
fatigue scale was used to measure fatigue. Mood was assessed with the UWIST
Mood Adjective Checklist (Matthews, Jones & Chamberlain, 1990). A shortened
version of the modified Cognitive Interference Questionnaire (Sarason, Sarason,
Keefe, Hayes & Shearin, 1986) was used to assess intruding thoughts. The CIQ
requires subjects to rate the frequency with which they experienced specific
thoughts. The scale comprises 4 items relating to task-relevant interference and 6
items relating to task-irrelevant personal concerns such as personal worries, friends,
and past events. An unpublished post-drive measure of active coping was also
administered. This scale consists of 12 items concerning coping strategies that
relate to the driving task itself. Drivers also completed the DSI Fatigue Proneness
scale before their driving trip.

Results
Table 1 gives means and standard deviations for the state measures. Two-tailed
paired-samples t-tests were calculated for each state variable to measure the extent
to which the level of pre-drive state differed from the level of post-drive state (see
Table 1). The results of these analyses show significant increases in tense arousal,
depression, fatigue and task-relevant and task-irrelevant cognitive interference
following the drive. In addition, a significant decrease in energetic arousal was
found following the drive.

Table 1. Descriptive statistics for pre- and post-drive state measures

**p<.01, ***p<.001

Pearson correlations were calculated to measure the possible associations between


the Fatigue Proneness and active coping scales and state measures. Partial
correlations were also calculated in which each pre-drive state variable was
controlled in order to determine if the change in subjective state was related to the
Fatigue Proneness and active coping scales. Table 2 gives these correlations.

Table 2. Correlations between Fatigue Proneness, Active Coping and state measures

*p<.05, **p<.01.
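The partial-correlation step described above (controlling each pre-drive state when relating a
trait scale to the post-drive state) can be sketched with the standard first-order partial
correlation formula; the Python below is an illustration with assumed variable names, not the
analysis code used in the study.

# Illustrative sketch: correlation of a trait scale with a post-drive state
# measure, partialling out the corresponding pre-drive state.
from math import sqrt
from scipy.stats import pearsonr

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy = pearsonr(x, y)[0]
    r_xz = pearsonr(x, z)[0]
    r_yz = pearsonr(y, z)[0]
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# e.g. partial_corr(fatigue_proneness, post_drive_fatigue, pre_drive_fatigue)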
The results show that Fatigue Proneness was positively associated with pre- and
post-drive tense arousal, fatigue and task-relevant and irrelevant cognitive
interference. Fatigue Proneness was also negatively associated with pre- and post-
drive hedonic tone, and with post-drive energetic arousal. The partial correlations
indicate that the increase in tense arousal and fatigue during the drives was related
to the Fatigue Proneness scale. Pearson correlations between the active coping scale
and state measures show that active coping is unrelated to the state measures. The
partial correlations show that the decrease in anger/frustration during the drive was
related to active coping while changes in the other state measures are unrelated to
active coping. Thus, in contrast to the findings of Desmond’s (1997) simulator
studies, it appears that the increase in fatigue found in the present study does not
relate to active effortful coping. There is a concern that the correlations shown in
Table 2 may be an artifact of the duration of driving trips that might, conceivably,
be confounded with both Fatigue Proneness and state measures. Thus, in order to
address this possibility, driving trip duration was correlated with the state measures,
Fatigue Proneness and active coping scales. The results showed that trip duration
was unrelated to Fatigue Proneness and state measures.
Table 3 gives the state change scores for the significant state changes found in the
present study and Desmond’s (1997) two simulator studies. The scales show consistency
in the direction but not in the magnitude of change. With the exception of energetic
arousal, the magnitude of change for the state measures is substantially larger in the
simulator studies than in the present field study. These results suggest that the simulator
drives elicit stronger and more stressful reactions than the real drives.

Table 3. Standardised state change scores for field and simulator studies
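The exact standardisation behind Table 3 is not spelled out in the text above; one common
approach, shown as a Python sketch below, is to divide the mean pre-to-post change by the
standard deviation of the pre-drive scores so that field and simulator studies can be compared
on a common scale. Treat the formula as an assumption and the variable names as placeholders.

# Illustrative sketch: a standardised pre-to-post state change score.
import numpy as np

def standardised_change(pre, post):
    """Mean change divided by the pre-score standard deviation (one convention)."""
    pre = np.asarray(pre, dtype=float)
    post = np.asarray(post, dtype=float)
    return (post - pre).mean() / pre.std(ddof=1)

# e.g. standardised_change(pre_drive_fatigue, post_drive_fatigue)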

Discussion
The results of the study support the first hypothesis proposed. The first hypothesis
stated that subjective fatigue would be characterised by changes in mood and
cognitive state. The findings from the pattern of changes in the subjective state
measures provide support for this hypothesis. Drivers not only experienced
increased fatigue but also experienced increased tension, depression, annoyance
and task-relevant and task-irrelevant cognitive interference. Thus, this study has
replicated the results from Desmond’s (1997) simulator studies, and provides
further evidence of the emotional and cognitive changes that characterise fatigue.
The second hypothesis stated that Fatigue Proneness and active coping would be
related to changes in mood and cognitive state. The results of the study provide
some support for the first part of this hypothesis. Fatigue Proneness was found to
relate to changes in tension and fatigue. This result is consistent with the findings of
Desmond’s simulator studies. However, active coping was unrelated to changes in
almost all of the subjective state measures. This latter result is inconsistent with the
findings from the simulator studies in which active coping was found to relate to
post-drive fatigue, mood, and cognitive state. This inconsistency may be explained
by the difference in the magnitude of subjective state change found in the present
study and the simulator studies. The results showed that state change for most of the
subjective measures was substantially larger in the simulator studies than in the
field study. Thus, it appears that the simulator drives were experienced as more
stressful by drivers than the real drives. The simulated environment represents a
novel situation for drivers and its unfamiliarity may serve to heighten stressful
reactions. An alternative explanation is that the determinants of active coping may
differ in simulated and real drives. Post-drive active coping was higher in the field
study, implying drivers may be generally more motivated to cope with stress in real-
life. In conclusion, the present study supports the utility of a transactional model of
driver stress in accounting for the relationships between Fatigue Proneness,
cognitive processes and affective reactions in the real world context. However, the
role of coping in mediating the relationship between Fatigue Proneness and acute
fatigue reactions requires further investigation.

References
Cameron, C. 1974, A theory of fatigue. In A.T.Welford (Ed.), Man Under Stress,
(Taylor and Francis, London).
Desmond, P.A. 1997, Fatigue and stress in driving performance. Unpublished
doctoral thesis. University of Dundee, Scotland.
Mackie, R.R. & Miller, J.C. 1978, Effects of Hours of Service, Regularity of
Schedules and Cargo Loading on Truck and Bus Driver Fatigue. Santa
Barbara Research Park, Goleta, California.
Matthews, G. 1993, Cognitive processes in driver stress. In Proceedings of the
1993 International Congress of Health Psychology, (ICHP, Tokyo).
Matthews, G., Jones, D.M. & Chamberlain, A.G. 1990, Refining the measurement
of mood: The UWIST Mood Adjective Checklist. British Journal of
Psychology, 81, 17–24.
Matthews, G., Desmond, P.A., Joyner, L.A., Carcary, W. & Gilliland, K. 1997, A
comprehensive questionnaire measure of driver stress and affect. In Traffic
and Transport Psychology. Amsterdam: Elsevier.
McDonald, N.J. 1984, Fatigue, safety and the truck driver, (Taylor and Francis).
Sarason, I.G., Sarason, B.R., Keefe, D.E., Hayes, B.E. & Shearin, E.N. 1986,
Cognitive interference: Situational determinants and traitlike characteristics.
Journal of Personality and Social Psychology, 51, 215–226.
HOW DID I GET HERE?—DRIVING WITHOUT
ATTENTION MODE

J.L.May & A.G.Gale

Applied Vision Research Unit, University of Derby,


Mickleover, Derby, DE3 5GX, UK.
Tel/Fax +44 1332 622287 E-mail: AVRU@derby.ac.uk

Driving without attention mode (DWAM) is a state where the driver loses
awareness while driving. Evidence of this is when the driver suddenly realises
their location without being able to recall the actual process of having driven to
get there. DWAM is not only found in car drivers but may also be a causal
factor in some cases of SPaD’s (Signals Passed at Danger) experienced by
train drivers. The number of accidents which are a direct cause of DWAM is
unknown but clearly constitutes a serious hazard. This paper reviews research
in the area of DWAM and discusses its relevance to all vehicle drivers. Measures
introduced to help prevent DWAM are discussed.

Introduction
Many drivers recount some experience of “waking up” while driving and realising that they
have driven some distance without being able to recall exactly how they got there (Kerr,
1991). Driving is a highly visual task requiring constant monitoring of the road environment
to ensure that vehicle control is maintained. The presence of drowsiness or states of
inattention which might occur while the driver is travelling therefore may result in
inappropriate responses or no responses being made to a potentially hazardous situation.
Such actions are often attributed by the person concerned to inattentiveness, a lowered state
of awareness or fatigue leading to a failure to react adequately to the changes in the road
situation. There is evidence, however, to suggest that it is more than just a case of driver
fatigue (Furst, 1971), with drivers often observed to sit in the normal driving position gazing
straight ahead with a glassy stare. Williams (1963) attributed this cognitive state to
trance-inducing features of the driving situation, such as repetitive and monotonous
stimulation, particularly on highways. Drivers also often report that they were in a
“trance-like state” rather than asleep. This has led to the term “highway hypnosis” being
widely used to refer to this state of inattention while driving. One of the earliest accounts of
the term ‘road hypnotism’ is by Brown (1921):

“A large limousine was rolling north at 15 miles an hour. At the rear a similar vehicle
approached moving faster…. There was ample space for the second car to pass, but
to my astonishment it came up behind and crashed squarely into the first machine. It
was absurd. The second driver [a chauffeur] had sat at ease, his hands on the wheel,
his gaze straight ahead. There was nothing to divert his attention…. Asleep at the
wheel—sound asleep. The driver had been gazing at the bright streaming roadway
flowing smoothly beneath him. Its monotonous sameness concentrated his mental
faculties to the point of inducing momentary self hypnotism.”

This highlights some of the beliefs concerning highway hypnosis, namely: a hypnotic trance,
sleep-inducing features of the road, fatigue, pre-occupation and a recognition of the condition
by professional drivers. The symptoms commonly detailed in the literature are: a trance-like
state or glassy stare, late recognition of road hazards, a gradually developing steering bias, or
auditory and visual hallucinations with the driver eventually even falling asleep. The
condition seems more likely to occur on familiar roads or featureless countryside under
conditions of monotonous travelling “where the lack of novelty promotes passive and
automatic responses” (Williams and Shor, 1970). Hallucinatory experiences and poor
judgement are commonly reported by truck drivers (Wertheim, 1981).
The term ‘highway hypnosis’, although popular with the media, is a barrier to scientific
inquiry into this problem (Brown, 1991). More recently this condition has therefore been
referred to as Driving Without Attention Mode (DWAM) (Kerr, 1991), defined as a state
of inattention or loss of awareness of the driving task by the person controlling the vehicle.
Although mainly documented in car driving, DWAM has also been recognised as potentially
affecting train drivers and aircraft pilots (e.g. Kerr, 1991). It becomes more prominent as people
control vehicles for prolonged periods. Similar phenomena have been shown to occur when
performing routine industrial operations. It is possible therefore that many repetitive tasks
performed under predictable and monotonous conditions could produce a similar cognitive state.

Theories of causes of driving without attention mode


Only a few explanations have been offered of this phenomenon. Some of the theories do not
adequately explain its nature or origins and are very difficult to validate in experimental research.

a) Fatigue
McFarland and Moseley (1954) suggested that fatigue may be the most important factor in
DWAM. However while fatigue may facilitate the occurrence of this condition, drowsiness in
car driving has been shown to also occur when there is no evidence of excessive fatigue
(Roberts, 1971). Currently there is insufficient evidence to determine the exact relationship
between fatigue, sleepiness and DWAM, although fatigue may be a contributory factor but not
necessarily a causal one.

b) Hypnosis
Williams (1963) suggested that the monotony of the surroundings and the necessity to attend
to only a very small part of the visual field might induce some sort of hypnotic trance. There
is no direct evidence to support a relationship between hypnosis and DWAM but it is
recognised that the hypnotic state is affected by sleep deprivation.

c) Hyperinsulinism
Roberts (1971) suggested that the occurrence of excessive drowsiness might be due to
functional hyperinsulinism—an over-sensitivity to a certain concentration of sugar in the
blood. This may lead a person (particularly narcoleptics, who suffer from unsuspected attacks
of sleepiness) to experience sudden attacks of lowered consciousness. However the
symptoms are too common within the population of drivers for this to be a general
explanation (Wertheim, 1991).

d) Automation of Driving Task


The use of automatic or higher cognitive processing may cause some states of DWAM (c.f.
Reason, 1987). Vehicle control requires a mixture of controlled and automatic responses.
When the driving environment becomes more predictable less feedback is required and it is
such predictability which induces DWAM. For instance train drivers may experience false
expectations of signal aspects which could restrict their perception and assimilation of true
information (e.g. a signal may be interpreted as orange when it is in fact red). The probability
of making such errors increases with task proficiency. Buck (1963) attributed possible causes
of some SPaDs to factors such as a driver incorrectly estimating his location or totally losing
his position on the track and missing a signal or selecting the wrong signal. Drivers have to
learn the route before they are allowed to drive it and thus a state of automatic processing or
inattention may occur when driving on a well learned route.

e) Oculomotor Control and Alpha activity


Wertheim (1991) proposed that DWAM is associated with a lessening of reliance upon
attentive stimulus information and a move towards internally governed oculomotor control,
based on internal representations. This is a shift from actively reacting to changes in the
environment to monitoring an unchanging mental representation of the predictable
conditions.

f) Monotony
Monotonous tasks can depress the levels of performance and arousal due to lessened sensory
stimulation. DWAM may develop by continuously looking at the same objects in the visual
field moving in a predictable pattern. A monotonous road situation, however, does not always
imply a predictable one (e.g. when driving in thick fog).

Problems of DWAM

Quantification
Few official accident records have been kept detailing the occurrence of DWAM and
therefore it is very difficult to gain an understanding of the size of the problem. Retrospective
interviewing of drivers is problematic.

Definition
Limited public awareness of DWAM, and ambiguity between it and other potential states such as
drowsiness, may lead to under-reporting. For example, some train drivers may have a
tendency to drowse whilst driving (Endo and Kogi, 1975).

Experimental work
DWAM is a far reaching problem and existing experimental work may not have addressed all
the contributing factors or considered all their interactions. For instance in addition to the
above, the effect of common drugs such as alcohol, caffeine and nicotine need to be
determined. The role of diurnal variations and biological rhythms may also be important,
particularly for pilots who frequently cross time zones.

Possible Solutions

Devices
A variety of devices have been developed which monitor the physiological alertness of the
body and activate a warning when the driver becomes drowsy. For instance: registering the
pulse; blood pressure; muscular reflex; or eye lid reflex have been studied. Two examples
highlighted by Roberts (1971) are the ‘Buzz Bonnet’ and the ‘Autoveil device’. The response
to devices such as these however can occur so late that their value is greatly diminished. It is
also unlikely that one single device universally addresses the problem of DWAM, and it is
important to look at each individual task and each user’s limitations and capabilities.
Wilde and Stinson (1983) found that certain types of vigilance devices for train drivers
were not linked with direct control of the train. These could in fact divert the driver’s
attention away from driving the train to the task of cancelling the warning, and thus fail to
refocus attention on the driving task. Vigilance devices should direct the driver’s attention to some
specific train driving task such as speed control. The Automatic Warning System (AWS) was
designed to warn train drivers of signal aspects, but it may not be effective in the
long term. It is feasible that, due to the frequency with which the driver has to cancel the AWS,
they may learn to respond automatically. Cancelling the AWS may therefore become
somewhat ineffective against attention loss. Also, while such vigilance devices and safety
systems have been introduced, it may be many years before they are implemented
widely. The problem of SPaDs must therefore still be addressed.

Steps the driver can take


There are various measures a driver can take if experiencing DWAM, such as taking a break
or listening to the radio. The ability of the driver to take such steps will depend on:

• Their recognition of symptoms and awareness that they are not attending to the task. If the
features of the road do have some hypnotic effect, however, then the driver may not be
aware of it.
• Their recognition and understanding of DWAM. They may think that once they have
recognised the symptoms of DWAM then they will be able to keep themselves awake and
so continue driving.
• Their willingness and ability to perform such steps. For example a driver may be under a
time pressure to reach a destination.

Driver Education
Many drivers are unaware of DWAM, fail to recognise its occurrence, and do not know
what measures to take to reduce it. It is important that the driver recognises the
symptoms of DWAM so that they can make an appropriate judgement about their physical and
mental fitness to drive.

Roadway Engineering
This needs to determine ways in which novelty and variability can be introduced into the
driver’s task and environment. Such measures include the introduction of minor curves on
long straight stretches of road, different types of road surfaces producing changes in noise
and vibration, and rumble strips placed at the side of the road.

Conclusion
Research into DWAM needs to bring together all transportation areas and not just concentrate
on car driving. Theoretical explanations tend now to emphasise higher levels of learning and
automation, in conjunction with the predictability of the external vehicle scene. Further
research is needed to look at the role of all possible contributory factors. Current devices to
counteract DWAM may not be fully appropriate in addressing the problem. Human factors
research is needed, not only in defining the problem and its components, but also in assessing
the suitability of devices and roadway engineering to address the problem and assess the long
term benefits.

References
Brown W. 1921, Literary Digest, June 4th, 69, 56–57.
Brown I. 1991, Highway Hypnosis: Implications for Road Traffic Researchers and
Practitioners, In A.G.Gale et al. (eds.) Vision In Vehicles III (Elsevier Science
Publishers B.V. North Holland).
Buck, L. 1963, Errors in the Perception of Railway Signals, Ergonomics, 11 (6).
Endo, T. & Kogi, K. 1975, Monotony Effects of the Work of Motormen during high speed
train operation. Journal of Human Ergology 4, 129–140.
Furst C. 1971, Automatizing of Visual Attention. Perception and Psychophysics, 10, 65–69.
Kerr, J.S. 1991, Driving without attention mode (DWAM): A normalisation of inattentive
states in driving. In Vision In Vehicles III (op.cit.)
McFarland R.A. & Moseley A.L. 1954, Human Factors in Highway Transport Safety.
(Boston, Harvard School of Public Health).
Reason J.T. 1987, The cognitive basis of predictable human error. In Megaw, E.D. (ed.)
Contemporary Ergonomics, (Taylor and Francis, London).
Roberts H.J. 1971, The Causes, Ecology and Prevention of Traffic Accidents, (Charles
C.Thomas Publisher, Springfield, Illinois, USA).
Wertheim A.H. 1981, Occipital alpha activity as a measure of retinal involvement in
oculomotor control. Psychophysiology 18, 4:432–439
Wertheim A.H. 1991, Highway Hypnosis: A Theoretical Analysis, In Vision In Vehicles III
(op.cit.)
Wilde, G.J.S. & Stinson, J.F. 1983, The Monitoring of Vigilance in Locomotive Engineers.
International Journal of Accident Analysis and Prevention 15 (2) 87–93.
Williams G.W. & Shor R.E. 1970, An Historical Note on Highway Hypnosis, Accident
Analysis and Prevention, 223–225.
Williams G.W. 1963, Highway Hypnosis: A Hypothesis. International Journal of Clinical and
Experimental Hypnosis, 11(3), 143–151.
SENIORS’ DRIVING STYLE AND OVERTAKING:
IS THERE A “COMFORTABLE TRAFFIC HOLE”?

Tay Wilson

Psychology Department, Laurentian University


Ramsey Lake Road
Sudbury, Ontario, Canada, P3E 2C6
tel (705) 675–1151

“Conventional wisdom” held by many on both sides of the Atlantic is that
driving near the speed limit results in being overtaken by a continuous
stampede of drivers, causing significant trip delay, thus necessitating
driving with ever more speed to avoid being overtaken. This conventional
wisdom is controverted by an on-road study on the British A1, in which a
relatively “comfortable traffic hole” was discovered consisting of driving in
the inside lane wherever possible at a speed just above the legal speed of
lorries (60m.p.h.). Actual trip time delay caused by involuntary slowing for
traffic events was found to be of the order of only 10 minutes over journeys
of about five hours. Trip aspects relevant to mental load and risk and
subjective time estimate implications are discussed.

“Conventional wisdom” held by many on both sides of the Atlantic is that driving near the
speed limit results in being overtaken by a continuous stampede of drivers and thus, by being
trapped in the inside lane behind slow traffic, incurring significant trip time delay. Extending
this reasoning leads to a sort of unstable positive feedback loop in which drivers adopt driving
styles involving traveling at ever higher speeds to avoid being overtaken. This tactic presents
real risk, not least to older and some other categories of drivers who are not inclined towards
high speed driving. In a series of three earlier on-road studies of overtaking (Wilson and Neff,
1995; Wilson, 1996; Wilson, 1997a, b), actual patterns of overtaking on various
Canadian roads have been examined which belie much of this “wisdom”. On the basis of this
earlier work, it is hypothesized here that there exists, on the British A1, a relatively
“comfortable traffic hole” which older and other drivers can use to reduce apparent mental
work load and risk. This “comfortable traffic hole” consists of driving in the slow lane just
faster than the posted speed limit for lorries (60m.p.h.). Furthermore it is hypothesized that
contrary to subjective estimates by many (Wilson and Ng’Andu, 1994) of “lost” time due to
this driving style strategy, little actual trip time will be lost.

Method
The test driver was a senior citizen aged 77 with a valid driver’s licence, a half century of
driving experience, training as a multi-vehicle military driver and 40 years experience of
driving the A1 several times a year. The car was a 1997 Volkswagen Polo with only a few
hundred miles on the odometer. Two normal trips along the A1 were chosen. The northbound
trip began at 2:29p.m. on April 17, 1997 at the Stoke Rochford entrance to the A1 and
terminated at 7:28p.m. at Berwick. The southbound trip began at 3:30p.m. at the Felton
entrance to the A1 and ended at 8:20p.m. at the turn-off from the A1 to Cambridge. The driver
was instructed to drive, as normal, at a target speed of 60 to 65m.p.h.—just faster than the
speed limit for lorries on the road (60m.p.h.) and considerably slower than the posted speed
limit for cars (70m.p.h). The target speed dictated that the test driver would spend most of his
time in the inside or slow lane. Each trip included a rest lasting about forty five minutes. The
experimenter recorded on-going speed, overtaking events, involuntary slowing and other
salient events. Since the experimenter had ridden as passenger with the test driver many times
and had had many conversations about driving during this time, it is likely that observations
by the experimenter caused minimal disturbance to normal driving.

Results
Table 1 shows a tabulation of cars and lorries overtaking and being overtaken by the
test driver over 30 minute trip segments for the northbound trip (trip 1) and the southbound
trip on the A1 in Britain. (Rests of 40 and 55 minutes were taken during the two trips.) On
the northbound trip, about 15 times as many cars (438) overtook the test driver as were
overtaken by him (28) while he overtook about ten times as many lorries (106) as overtook
him (10). On the southbound trip, about nine times (370) as many cars overtook the test driver
as were overtaken by him (43) while he again overtook about ten times as many lorries (126)
as overtook him (12). (χ2, p<0.005 for all four comparisons.) No significant mean or variance
differences between the north bound and the south bound trip were found for car overtaking,
for lorry overtaking, for being overtaken by cars or for being overtaken by lorries. Thus one is
unable to reject the null hypothesis that there was no difference in the overall overtaking
experience on the two trips—and hence that they were just two typical A1 trips.
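As an illustration of the scale of these differences, the following short sketch (in Python, and assuming the reported chi-square values were goodness-of-fit tests of each overtook/overtaken pair of counts against an even split; the exact test procedure is not stated in the paper) reproduces the northbound car comparison.

    # Illustrative only: goodness-of-fit chi-square on the reported northbound car
    # counts (438 cars overtook the test driver; he overtook 28), assuming an
    # expected 50/50 split. The original analysis may have been computed differently.
    from scipy.stats import chisquare

    observed = [438, 28]
    result = chisquare(observed)   # expected frequencies default to an even split
    print(f"chi-square = {result.statistic:.1f}, p = {result.pvalue:.3g}")
    # The statistic (about 360 on 1 degree of freedom) lies far beyond the
    # p < 0.005 criterion quoted in the text.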
Table 2 lists the occasions, with salient comments, on which the test driver was
forced to slow down below his preferred cruising speed or for roundabouts, by time of day,
duration in minutes (all fractions of minutes were rounded up to the next whole minute),
overtaking experience during the slow-down (overtaking or being overtaken by cars and
overtaking or being overtaken by lorries) and finally by the enforced speed during the slow-down.
Consider involuntary slowing to accommodate traffic conditions in Table 2. In the two
trips, only one minute of enforced slowing below the test driver’s preferred cruising speed
was caused by a slower car ahead, six minutes by traffic merging into lanes of the A1, and
sixteen minutes below Newcastle by slower trucks while 28 minutes was caused by slower
trucks on the one-lane section of the road below Berwick. Seven minutes of enforced slowing
was due to seeing brake lights ahead for various reasons including traffic build-up. Ten
and eleven minutes of enforced slowing were due to round-abouts and warning signs
respectively.
A clear road with no visible traffic ahead was experienced for six minutes on each trip.
The test driver spent only three minutes in the fast lane on each of the trips beyond immediate
overtaking-and-return manoeuvres, both times while traveling slowly in heavy traffic.

Discussion
The data here support the existence of a ‘comfortable traffic hole’ within which seniors
and others can normally drive relatively comfortably without undue delay or stressful events.
From the time spent involuntarily driving at 60, 55, 50, and 45 m.p.h. respectively, it can be
determined that a total of twenty-four miles was “lost” because of involuntary reductions in
speed below the 65 m.p.h. target speed over ten hours of trip time, or, in round terms, that
about 10 minutes of extra driving time was added to each of the two trips. (About 25% of this time
(Newcastle in trip 1 and construction in trip 2) would have been difficult to “save” by any
amount of more aggressive driving because all traffic lanes were clogged and moving at the
same speed.) In conclusion, there was no meaningful trip delay caused by the driving strategy
of the test driver. (Wilson and Ng’Andu (1994) found evidence suggesting that high accident
drivers tend to report subjective trip time to be higher than actual time, whereas low accident
drivers report no difference.)
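The time-cost arithmetic behind this estimate can be sketched in a few lines (an illustrative Python fragment; the 24-mile shortfall is the figure given above, and the helper function merely restates the calculation that would be applied to the enforced speeds and durations in Table 2):

    # Sketch of the "lost time" estimate. The 24-mile figure is the total distance
    # shortfall reported in the text for the two trips combined.
    TARGET_MPH = 65
    shortfall_miles = 24

    extra_minutes = shortfall_miles / TARGET_MPH * 60
    print(f"~{extra_minutes:.0f} min extra over both trips, "
          f"~{extra_minutes / 2:.0f} min per trip")   # ~22 min, ~11 min per trip

    def shortfall(slow_downs):
        """slow_downs: iterable of (enforced speed in mph, duration in minutes),
        as could be read off Table 2."""
        return sum((TARGET_MPH - v) * t / 60 for v, t in slow_downs)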
Second, consider the events. Over two trips on the A1, beyond the short time delays
described above, the test driver experienced no occasion in which he was not in his preferred,
comfortable, and “legal and safe” position on the road. Only on one occasion (for a merging
bus) did the test driver brake to slow down in a manner perceptible to the experimenter—and
that was best described as light braking. The conclusion is that there does appear to be a
driving style on the A1 suitable for many older and other drivers which provides a relatively
lower risk and lower mental load journey without losing significant trip time, namely
driving just faster than the speed limit for lorries (60 m.p.h.) in the inside lane for as much of
the journey as possible. (Note that driving at this speed results in being overtaken about once
per minute by cars and twice per hour by lorries.)

References
Wilson, Tay and Ng’Andu, Bwalya., 1994. Trip time estimation errors for drivers classified
by accident and experience. In S.A.Robertson (ed.) Contemporary Ergonomics
(Praeger, London), 217–222.
Wilson, Tay and Neff, Charlotte, 1995. Vehicle overtaking in the clear-out phase after
overturned lorry has closed a highway. In S.A.Robertson (ed.) Contemporary
Ergonomics (Praeger, London), 299–303.
Wilson, Tay, 1996. Normal traffic flow usage of purpose built overtaking lanes: A technique
for assessing need for highway four-laning. In S.A.Robertson (ed.) Contemporary
Ergonomics (Praeger, London), 329–333.
Wilson, Tay, 1997a. Overtaking on the Trans-Canada: conventional wisdom revised. In
S.A.Robertson (ed.) Contemporary Ergonomics (Praeger, London), 104–109.
Wilson, Tay, 1997b. Improving drivers skill: can cross cultural data help? In Don Harris (ed.)
Engineering Psychology and Cognitive Ergonomics (Ashgate, Aldershot), 1, 395–401.

Table 1. A1 Car and Lorry Overtaking by 30 min. segments when driving at a 60–65 mph target speed.

Table 2. Involuntary slowing on the A1 to 60, 55, 50, 45 mph and for roundabouts
by time, duration in minutes (Dur.) and car (C) or lorry (L) and overtaking event.
SPEED LIMITATION AND DRIVER BEHAVIOUR

Di Haigney1 and Ray G.Taylor2

1 Road Safety Dept., RoSPA, 353 Bristol Road, Birmingham B5 7ST
2 Applied Psychology Division, Aston University, Aston Triangle, Birmingham B4 7ET

Engineering models of driver behaviour suggest that if more stringent
physical limits were enforced on speeding behaviour, a significant decrease
in the frequency of speeding related injury accidents would occur. Driver
behaviour was tested under various speed limitation conditions on the Aston
Driving Simulators. Participants also completed a questionnaire testing for
affective response, attention and awareness of task performance per
limitation condition. Differential effects of limitation were noted in driving
behaviour, as well as the frequency of accident types. The overall mean
accident frequency did not vary significantly across conditions.

Introduction
ECMT (1984) indicates that a reduction in average speed throughout Europe of about one
kilometre per hour could save seven percent of fatal accidents on the roads per year. Surveys
carried out by the U.K. Department of Transport (DoT, 1994) involving over nine million
vehicles, however, confirm a widespread disregard for speed limits, with some 60 percent of
drivers exceeding the posted speed limit on motorways.
Given this apparent reluctance of the public to obey posted limits, speed reduction and the
concomitant reduction in injury accidents predicted above could be achieved directly through
the installation of speed limiting devices in vehicles. However, enforced speed limitation may not
necessarily result in safety benefits of the magnitude cited above, as the ECMT (1984) safety
benefit calculations are based solely upon a ‘non-interacting engineering’ model of driver
behaviour.
In brief, this model assumes that any changes to accident risk caused by an intervention
will be passively accepted by the road user population—such that an improvement in safety
can be predicted using an engineering calculation (Adams 1985). Other patterns of behaviour
are thought to remain relatively unaffected with ‘knock on’ behavioural effects being of
negligible magnitude.
The ‘non-interacting’ assumption of the engineering approach has been challenged by
a number of studies which have found very little direct correspondence between the
predicted safety benefits calculated and the actual recorded changes in traffic accidents
(Evans, 1985).
The weak relation between engineering calculations and actual safety benefit may be
attributed in part to poor data collection and analysis practice (Haigney, 1995), although a
number of researchers also maintain that the low association is an indication that behavioural
‘compensation’ occurs. That is, the safety benefit arising from an engineered safety
intervention is exploited by drivers to allow some change in performance associated with
greater positive utility (e.g. increased speed), which effectively reduces the safety benefit
realised through the intervention (Wilde 1982).
This study attempts to assess the responses of drivers to the introduction of enforced
maximum speed restrictions of varying levels of stringency.

Method
The Aston Driving Simulator (ADS) is a fixed base, closed loop simulator in which a
participant is able to interact fully with a computer generated environment via the
manipulation of a steering wheel, brake and accelerator pedals, spatially arranged so as to
mimic the operation of an automatic vehicle.
The ADS registers and stores data on the participant’s manipulation of these controls and of
the user’s performance in the simulated environment (e.g. position in the carriageway, collisions
with other vehicles, collisions with the edge of the carriageway) each half second. A monitor
generates a view of a single lane carriageway ‘populated’ with other simulated car drivers who
travel at a steady thirty miles an hour on both sides of the road. These simulated ‘others’ are
able to interact intelligently with the user through overtaking when appropriate to do so.
Prior to either a practice run on the ADS or to data collection, all participants were read
standardised instructions outlining the experimental procedure and conditions. Each
participant was told that they had been allotted fifty ‘points’ which would be reduced by a
certain amount according to the ‘severity’ of the collision experienced on the ADS, namely:
Head-on crash=25 points lost; Rear of car in front=10 points lost; Veering off the road=5
points lost. Following a collision, the simulated vehicle would be centred back onto the
lefthand carriageway and the participants would be required to pull away again from a
stationary position.
Participants were also informed that subjects with points remaining at the end of all
experimental runs would receive this number of points in pounds sterling. Participants who
continued to collide with objects after having lost all fifty points were to pay the excess
amount in pounds sterling to the experimenter. At this point, participants either chose to
sign a slip agreeing to these conditions or declined to participate further in the experiment.
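A minimal sketch of this incentive scheme (hypothetical Python, not the actual ADS software; only the penalty values are taken from the description above) is:

    # Illustrative reconstruction of the points/penalty scheme; the real ADS
    # implementation is not published and will have differed in detail.
    PENALTIES = {"head_on": 25, "rear_end": 10, "off_road": 5}

    def remaining_points(collisions, start=50):
        """collisions: list of collision-type labels logged over all runs."""
        return start - sum(PENALTIES[c] for c in collisions)

    balance = remaining_points(["off_road", "rear_end"])
    print(balance)   # 35
    # balance > 0: the participant is paid that many pounds sterling
    # balance < 0: the participant nominally owes the excess (in fact later waived)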
After having been read these instructions and having agreed to them, participants were
allotted a ten minute ‘practice’ run on the driving simulators in order to become familiar with
the working of the ADS. No speed limitation or other form of performance restriction was
experienced in the practice run.
Following the practice run, participants knowingly entered the experimental driving
conditions through responding to on-screen prompts via a keyboard. The four experimental
conditions comprised restrictions on the maximum speed possible throughout each ‘run’ of
30mph, 45mph, 70mph and 120mph (the latter being in effect, a free speed condition).
Participants experienced each of the four experimental driving conditions in randomised
order. The speed limitation for each condition was displayed on the monitor to participants
prior to each ‘run’. Each participant experienced the same ‘track dynamics’ under each
limitation condition.
After completing the four experimental sessions on the ADS, participants completed a
Driver Attitude Questionnaire (DAQ), developed in part from a driving attitude questionnaire
devised by Parker et al (1994) and in part from a questionnaire devised by Hoyes (1990). The
DAQ assesses perceived risk per condition, awareness of changes in driving behaviour per
condition and the participant’s utility of speed.
Subjects were paid in pounds sterling for any ‘points’ remaining after the experimental
run. Participants who had expended their allocation of points and who had an ‘excess’ to be
paid to the experimenter were informed that they would be invoiced after all participants in the
experiment had been run. In reality, and delayed in order to maintain the credibility of the
mechanism, these subjects were informed after all subjects had completed the experimental
runs on the ADS that no monies were required to be paid.

Results
Forty participants—twenty male, twenty female—with mean age 27.44 (SD 8.56) were
tested on the Aston Driving Simulator (ADS). All participants had full, current driving
licenses with mean driving experience at 8 years (SD 8.94; minimum 1 year; maximum 34
years) and mean exposure at 7600 miles per annum (SD 1449.49).
Repeated measures ANOVAs were calculated for all the dependent variables provided
by the ADS. Correlations between ADS variables and responses to the DAQ were
calculated where appropriate. Data logged during the practice runs were not included in the
analyses.

Speed Variables
Mean speed was not found to differ significantly when tested by the gender of the participants
(DF=1; f=0.390; p<0.539), age (DF=1; f=1.058; p<0.175), driving experience (DF=2;
f=0.202; p<0.819) or driving exposure (DF=1; f=0.646; p<0.431).
Mean speed was found to be associated with a significant difference between speed
limitation conditions (DF=3; f=300.5855; p<0.001). As maximum speed capability became
increasingly restricted, mean speed rose as a percentage of this capability. In the 30mph
condition for example, subjects’ mean speed represented 95.64% of capability, whereas mean
speed accounted for 83.03% of capability at 45mph, 77.38% at 70mph and 48.62% in the
120mph condition. Participants also demonstrated reduced variation in speed from the mean
value in the more restricted conditions, with SD for mean speeds ranging from 2.1260 in the
30mph condition, 6.5791 in the 45mph condition, 10.9756 in the 70mph condition and
14.2820 in the 120mph condition.
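The percentages quoted here follow directly from the logged speed data; a hypothetical sketch of the derivation (illustrative function and values only, not the actual ADS output format) is:

    # Hypothetical sketch: mean speed, its SD and mean speed as a percentage of
    # speed capability for one limitation condition, from the half-second speed log.
    import statistics

    def speed_summary(speed_log_mph, limit_mph):
        mean = statistics.mean(speed_log_mph)
        sd = statistics.stdev(speed_log_mph)
        return mean, sd, 100 * mean / limit_mph

    mean, sd, pct = speed_summary([28.7, 29.1, 28.5, 28.9, 28.6], 30)
    print(f"mean = {mean:.2f} mph, SD = {sd:.2f}, {pct:.1f}% of capability")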
An effect on speed variance resulting from collisions across conditions was assessed and
found to be nonsignificant for each type of collision recorded (collision with vehicle in front
[DF=2; f=0.356; p<0.702]; Collision with verge [DF=9; f=1.472; p<0.160]; head on collision
[DF=1; f=0.303; p<0.584]).

Acceleration Variables
Calculations of mean acceleration data from the ADS refer to the angular positioning of the
accelerator pedal, in effect the pressure applied to the pedal.
Significant differences were not recorded across any of the demographic variables: sex
(DF=1; f=0.026; p<.873), age (DF=1; f=0.086; p<0.772), experience (DF=2; f=0.386;
p<0.685), exposure (DF=1; f=0.088; p<0.770).
Mean acceleration was found to vary significantly across conditions, decreasing as speed
limits increased (f=16.409; df=3; p<0.00).
Acceleration variance was found to vary significantly across conditions (f=35.2376; df=3;
p<0.001), increasing as speed limits increased—possibly indicating a greater frequency of
overtaking or attempted overtaking manoeuvres (refer to ‘Lateral position variables’ below).
Statistically significant differences across conditions were not detected when the effect of
collisions on accelerator variance was controlled (collision with vehicle in front [DF=2;
f=0.496; p<0.612]; Collision with verge [DF=9; f=1.060; p<0.394]; head on collision [DF=1;
f=0.487; p<0.487]).

Braking Variables
Mean braking is determined through a numeric system relating to the degree of pedal travel
indicating the pressure being applied to this control.
No demographic variable exhibited a significant difference in mean braking values: sex
(DF=1; f=0.545; p<0.469), age (DF=1; f=0.156; p<0.697), experience of driving (DF=2;
f=0.228; p<0.798), exposure to driving (DF=1; f=0.258; p<0.546).
Mean braking was found to vary significantly across conditions, increasing as speed limits
increased (df=3; f=31.9623; p<0.001).

Lateral Position Variables


Position variables represent the distance between the ‘participant’s vehicle’ and the centre
line of the ‘road’.
Lateral position was not found to vary significantly across the demographic variables of:
sex (DF=1; f=1.663; p<0.211), age (DF=1; f=0.045; p<0.834), experience of driving (DF=2;
f=0.369; p<0.696), exposure to driving (DF=1; f=1.385; p<0.092).
The mean position of the ‘vehicle’ was found to vary significantly across conditions
(df=3; f=14.3198; p<0.001), with mean metres from the centre of the carriageway increasing
as speed limits increased.
Mean position was found to be significantly correlated with mean speed (r=0.63;
p<0.001), speed variance (r=0.74; p<0.001), and acceleration variance (r=0.50; p<0.01).
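Correlations of this kind can be reproduced from per-participant means in a few lines (a sketch with invented illustrative arrays, not the study data):

    # Illustrative only: Pearson correlation between per-participant mean lateral
    # position and mean speed; the arrays below are made-up stand-ins.
    from scipy.stats import pearsonr

    mean_position_m = [1.2, 1.5, 1.1, 1.8, 1.6]      # metres from the centre line
    mean_speed_mph  = [27.4, 38.9, 29.1, 55.2, 48.7]
    r, p = pearsonr(mean_position_m, mean_speed_mph)
    print(f"r = {r:.2f}, p = {p:.3f}")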
Position variance was tested across demographic variables and was also found to be
nonsignificant across each: sex (DF=1; f=0.342; p<0.565), age (DF=1; f=1.495; p<0.235),
experience of driving (DF=2; f=1.256; p<0.306), exposure to driving (DF=1; f=0.376;
p<0.546), although position variance was found to vary significantly across conditions
(DF=3; f=11.611; p<0.000).

Accident Variables
Head on collisions were tested against demographic variables and were found to be
nonsignificant by sex (DF=1; f=1.950; p<0.177), age (DF=1; f=1.852; p<0.188), experience
(DF=2; f=0.580; p<0.569), exposure (DF=1; f=0.002; p<0.969)
Collisions with the vehicle in front were also found to be nonsignificant across sex
(DF=1; f=0.461; p<0.504), age (DF=1; f=0.190; p<0.667), experience (DF=2;
f=0.088; p<0.916), exposure (DF=1; f=0.633; p<0.435).
The mean frequency of both car-car collision types increased significantly with less
restrictive limitation (head-on collision [DF=3; f=11.343; p<0.000], collision with the
vehicle in front [DF=3; f=10.245; p<0.000]).
Collisions with the verge were not significant across demographic variables: sex (DF=1;
f=.389; p<0.540), age (DF=1; f=2.113; p<0.161), experience (DF=2; f=1.006; p<0.383),
exposure (DF=1; f=.420; p<0.524) although this collision type was significant across
condition (DF=3; f=4.365; p<0.005). Mean accident frequency overall did not evidence a
significant difference across conditions (DF=3; f=3.328; p<0.996).

Driver Attitude Questionnaire (DAQ)


44% of the sample indicated that most accidents occurred to them in the 70mph condition,
although the greatest overall accident frequency actually occurred in the 45mph condition. If
only car-car collisions are considered, then the 120mph condition had the greatest total
number of accidents, and 40% of participants indicated that they felt at increased risk in this
condition.
Participants indicated their awareness of differential acceleration, braking and positioning
across the conditions, in line with ADS data. The majority of participants (61%) reported
paying increasing levels of attention as speed limits were increased, with participants also
tending to rate the higher speed limit conditions as being more enjoyable (56%).
Although participants acknowledged that speeding was one of the main causes of
accidents (76%), participants exhibited very relaxed attitudes towards speeding offenders—
considering them to be unlucky to have been caught (87%), most agreeing that it was
acceptable to drive faster than a posted speed limit (73%), possibly reflecting the majority
opinion that speed limits were set too low (72%). Most participants agreed that they
compensated for road conditions which they perceived to be safe by driving faster (81%).

Discussion
The decreased frequency of severe accidents in the more highly restricted conditions would
agree with the ‘non-interactive’ engineering approach, although the lack of significance
across conditions exhibited by the mean frequency of accidents overall, especially when
considered against the significant shifts in driver performance across conditions, could be
held to indicate some process of compensation.
Drivers exhibited an increase in ‘risk-acceptance behaviours’ (Wagenaar and Reason,
1990) via a tendency to ‘floor’ the accelerator pedal in the more highly restricted
conditions—to the point where mean speeds were at 95.64% of capability. This appeared to
be linked to frustration and boredom expressed by the subjects on the DAQ for these
limitation conditions. Increases in ‘safety behaviours’ (Matthews et al., 1991) were only
demonstrated in the less highly restricted conditions.
Drivers’ subjective perception of risk appeared to be associated most closely with car-car
collision events rather than total accident frequency; increased perception of risk was also
associated with increased attention and enjoyment of the task, all of which rose with speed
capability.
In summary, data suggest that speed limitation may prove an effective means through
which to reduce the likelihood of severe accidents, although not all accident typologies will
be affected equally. It should also be noted that the majority of participants showed evidence
of risk-taking behaviour in order to alleviate boredom and frustration arising from the more
restrictive limitation conditions. Given that subjects also indicated they regarded ‘speeding’
as acceptable, since posted limits were set ‘too low’, it may be that any feasibility study
examining the introduction of speed limitation technology should also consider the rationale
underlying the limits established per carriageway in order to counter the behaviours noted above.

References
Adams, J.G.U. 1985, Risk and Freedom—The record of Road Safety Regulations (Bottesford
Press, Nottingham, U.K.)
Department of Transport 1994, Vehicle speeds in Great Britain, 1993, Department of
Transport Statistics Bulletin (94) 30, U.K.
European Conference of Ministers of Transport 1984, Costs and benefits of road safety
measures (ECMT: Paris)
Evans, L. 1985, Human behaviour feedback and traffic safety, Human Factors, 27, 555–576.
Haigney, D.E. 1995, The reliability and validity of data, In Roads 17(2), 11–17.
Hoyes, T.W. 1990, Risk Homeostasis, Master of Science Thesis (Hull University, UK)
Matthews, G., Dorn, L.; Glendon, A.I. 1991, Personality correlates of driver stress.
Personality and Individual Differences 12, 535–549.
Nilsson,G. 1982, The effect of speed limits on traffic accidents in Sweden, VTI Report No.68,
S-58101, p.1–10, 1982 (National Road and Traffic Research Institute, Linkoping,
Sweden)
Parker, D., Reason, J.T., Manstead, A.S.R., Stradling, S.G. 1994, Driving Errors, Driving
Violations and Accident Involvement (Driver Behaviour Research Unit. Department of
Psychology, University of Manchester, U.K.)
Wagenaar, A.C., Reason, J.T. 1990, Types and tokens in road accident causation. Ergonomics
33(10–11), 1365–1375.
Wilde, G.J.S. 1982, The Theory of Risk Homeostasis: implications for safety and health,
Risk Analysis 2, 209–25.
THE ERGONOMICS IMPLICATIONS OF CONVENTIONAL
SALOON CAR CABINS ON POLICE DRIVERS

S.M.Lomas* and C.M.Haslegrave

Institute for Occupational Ergonomics
University of Nottingham
Nottingham, UK

* Now at Applied Vision Research Unit, University of Derby, Derby, UK

To minimise cost, police forces have to use vehicle fleets consisting of relatively
conventional cars. However, the way in which police officers use these vehicles
differs greatly from that of the general public, for whom the cars are primarily designed.
Due to the nature of their job, officers often spend a considerable amount of
time seated in the vehicle and there is often a need to enter and exit quickly and
easily. A further complication arises from several officers sharing one car with
each requiring a unique driving position in order to drive in a comfortable and
safe environment. Problems also arise as a result of the equipment officers are
required to carry and increasingly by the wearing of protective vests and body
armour. This paper presents the results obtained from a study on police drivers
and their interaction with the car cabin.

Introduction
People in general are spending more time in their cars as a result of several factors such as
increases in commuter traffic, business related journeys and traffic congestion. Thus the
ergonomics of the car cabin becomes increasingly important to enable the driver to perform
their work safely and efficiently with minimum effort and discomfort.
In recent years the number of possible adjustments to the car seating and controls has
significantly increased. The potential driving position adjustment of the car has been shown
by Gyi and Porter (1995) to be a significant factor in the amount of reported discomfort. Cars
which permit increased freedom of movement and improved postures by having adjustable
features result in fewer people being absent from work.
In many respects police officers are an extreme case for a vehicle ergonomist. Not only do
they spend a considerable amount of their job sitting in their police car, either driving or
performing other activities, but they also tend to be significantly larger than the average driver.
The nature of their job and the high average mileage increase the possibility that a police
driver may already have suffered injury through a previous incident, and this may compound
the problem.
The aim of this study was to investigate the problems of using conventional vehicles as
police cars. In order to understand this it was necessary to study the anthropometric
characteristics of the police force population, understand the various tasks that police cars
perform and obtain the comments and opinions of police officers.

Methods
Initially meetings were held with the fleet managers and representatives from a number of
police forces to gain an understanding of how best to access the population, obtain background
information on the officers, the cars they used and the roles and tasks they perform. Following
this a questionnaire was developed to elicit information from individual officers.
The questionnaire was distributed to police stations which had a mixture of Traffic,
Section and Traffic Support Officers. This provided information from officers who had
different duties (see Table 1), drove on different road types and drove different car types.

Table 1. Classification of Police Officer Duties

To provide more detail than would have been possible with a questionnaire alone, 25% of
questionnaire respondents participated in a follow-up interview. The interview was used to
obtain anthropometric data, determine the driving position and collect the officers’ opinions.
Accordingly, the interviews were carried out in the officers’ own police cars.
The results from the questionnaire, interviews and case study were wide ranging;
therefore only some of the major findings are summarised in this paper.

Results
The questionnaire was mainly completed by Traffic Police, as may be seen from Table 2. This
reflects the composition of officers at the large police stations targeted.

Table 2. Summary of Questionnaire Respondents

Stature
The interviewed police officers were found to be of considerably larger stature than
the general population, as may be seen in Table 3.

Table 3. Stature Percentiles

Discomfort and Driving


46.4% of officers reported that they experienced discomfort when driving or sitting in the
driving seat of their police car. Table 4 shows where these problems were reported.

Table 4. Region of Officers Discomfort

Duration of Driving
Officers spend a considerable amount of time driving or sitting in the police car. 75.9% of
officers drive for at least 20 hours per week, whilst 81.3% drive/sit in the driver’s seat for at
least 20 hours per week.

Adjustment of the Driving Position


54.4% of officers share a car with more than 5 other officers. Officers commented that
adjustment needed to be quick and easy to use, durable and repeatable. A small number of
officers pointed out that entry to a vehicle that had previously been used by a small driver was
often difficult.

Head Restraints
Most of the cars in the study had head restraints which were able to be adjusted in height and
tilt. Of the ones that were fixed, officers commented that they were positioned at an incorrect
height and were uncomfortable. Officers did not like the lack of control over them. However
in the interviews the vast majority of officers did not adjust the head restraint or check its
position.

Seat Belt Height Anchorage


The majority of the cars had manual adjustment of the seat belt anchorage height.
One make of car automatically adjusted the location of the seat belt depending on the position
of the person. Officers’ opinions of this type of automatic adjustment
varied. Some liked the fact that they were not required to remember to
adjust it, others disliked the lack of control and the position that the seat belt adopted. The
wearing of protective vests resulted in officers being unable to feel the position of the seat
belt. In some cases this was viewed as an advantage as belts were reported as being
uncomfortable. In other instances the body armour impeded exit from the car due to officers
not being able to sense that they were wearing a seat belt.

Steering Wheel
The level of adjustment of the steering wheel was reported as being unsatisfactory by 8% of
drivers. Problems were reported relating to obscured controls and displays, impeded entry
and exit, and difficulty in regaining a preferred position due to the lack of fixed detents.

Seat Height
25.9% of officers were dissatisfied with the seat height. An inappropriate seat height resulted
in officers either hitting their head on the car roof or exerting an excessive amount of effort
when exiting the car. A number of officers expressed significant irritation at this, as 43.2%
enter their police car more than 15 times per day.
A significant difference between civilian and police car derivatives is that the latter are not
fitted with sunroofs. This results in police cars having notably more headroom, partially
offsetting the relative increase in occupant size. Despite this some officers still experienced
difficulties with the amount of available head clearance. A particular problem identified was
drivers hitting their heads on the roof whilst travelling over speed ramps at high speed.

Lumbar Adjustment
27.8% of respondents believed that inadequate lumbar support was linked to their discomfort.
Of those that drove vehicles which had lumbar support adjustment, 30.4% were still
dissatisfied. All of the lumbar adjustments in the cars surveyed adjusted in the horizontal axis
only. During the interview a number of officers stated that some form of vertical adjustment
would be needed to provide an adequate level of support. Additionally several officers
complained that wearing body armour rendered seat lumbar support ineffective.

Seat Back Inclination and Wear and Tear


The majority of officers (87.5%) were satisfied with the seat back inclination. However, a
repeated comment related to concern over the durability of the vehicle interior and, in
particular, the degradation of the support offered by the seat back due to the accelerated wear
experienced by police cars as a result of frequent entry, exit and adjustment.

Use of Equipment Belts


The most significant problem experienced by police officers in vehicles appears to be related
to the wearing of equipment belts. Items carried on the belt, in particular the baton, frequently
get caught in the seat causing wear and tear, hindering exit and impairing the position of the
seat belt across the hips. Removal of the belt whilst driving is not considered viable as it
would place unacceptable restrictions on the officer’s ability to perform their duty.

Use of Body Armour


Officers generally wore one of two types of protective vest: an overt vest
(worn over the shirt and jumper), which was standard issue by each force, and a covert vest
(worn under the clothing), which was purchased by individual officers. Officers reported
problems with the overt vest being too restrictive, riding up the neck and digging into the mid
back when seated. Additionally, officers reported difficulties encountered when entering and
exiting the vehicle.
In a case study to investigate the implications of wearing body armour on an individual
officer, it was found that the overt vest significantly increased the chest depth (96th to 99th
percentile) and abdominal depth (85th to 99th percentile) of the wearer.

Discussion and Conclusions


Driving for more than 20 hours per week whilst at work has been reported as resulting in
increased sickness absence (Gyi and Porter, 1995). Additionally several studies have
indicated that driving is a causal factor in the prevalence of lower back pain (e.g. Porter et al,
1992). A report in Police Review (25th July, 1997) highlighted that Traffic officers are
particularly vulnerable to back pain.
The issue of police officers banging their heads on the roof of the vehicle when driving
over speed ramps is a serious concern, not only in terms of the health effects on the officers
but also in relation to the possible momentary loss of concentration and control.
Due to the large number of drivers who use each car, adjustments are required which are
quick and easy to use, repeatable and durable. The inclusion of fixed detents in the
adjustment of the steering wheel, for example, would make the driving position easier to reproduce.
Most officers prefer to have the ability to adjust elements of the driving position or safety
restraint system. However, the interviews showed that in the case of head restraints and seat
belts the driver very rarely made any adjustment. When used selectively, i.e. for the seat belt
anchorage height, automatic adjustment systems could have significant benefits in terms of
optimum positioning and speed of adjustment.
It is important that the integrity of the interior is maintained throughout the vehicle’s life.
In particular, the seat in a police car must be robust due to the frequency of entry and exit and
wear and tear that protective equipment and clothing places on it.

References
Gyi, D.E. & Porter, J.M. 1995, Musculoskeletal troubles and driving: A survey of the
British public. In Robertson, S.A. (ed.) Contemporary Ergonomics, (Taylor &
Francis, London), 304–308
Police Review, 1997, 25th July, Providing back support, Police Review
Porter, J.M., Porter, C.S. & Lee, V.J.A. 1992, A survey of car driver discomfort. In
Lovesey, E.J. (ed.) Contemporary Ergonomics, (Taylor & Francis, London), 262–267
THE DESIGN OF SEAT BELTS FOR TRACTORS

D H O’Neill*
B J Robinson**

*Silsoe Research Institute, Silsoe, Bedford


**Transport Research Laboratory, Crowthorne, Berks

The use of seat belts in tractor cabs is negligible. Although very few tractors
are fitted with seat belts, there is some evidence that the use of seat belts in
tractors fitted with cabs would save lives. The requirements of seat belts for
on-road and off-road use are reviewed, and the apparently conflicting needs,
which may exert a strong influence on their acceptability and use, are outlined.
Farmer attitudes towards the use of seat belts are discussed and original
experimental data from a study of seat belt use, addressing farmers’
comments, are presented.

Introduction
The legislation governing seat belt use has not, so far, demanded the use of seat belts on
tractors. Nevertheless, some tractors, usually the more expensive models, are sold with seat
belts fitted and there are (International) Standards covering their design, particularly
concerning anchorage points1. The main purpose of wearing a seat belt is to prevent the driver
from being ejected from the seat. This applies specifically to tractors with ROPS (Roll-Over
Protective Structures), usually in the form of cabs, rather than to “open” tractors, where being
thrown clear of the seat is one effective way of avoiding injury when a cabless tractor overturns.
However, overturning is more common off-road than on-road.

A recent study of on-road accidents involving agricultural vehicles (Robinson, 1994)
concluded that seat belts could save 15 to 20 fatalities and serious injuries per year. However,
to be effective on-road, where collisions with other (large) vehicles are the most serious threat
to safety, a 3-point (i.e. diagonal) design would be preferred. This presents a conflict with off-
road use, where tractor drivers consider such a design to be too restricting and a 2-point (i.e. lap
belt) design would be preferred.

1 ISO 3776 (1994). Tractors for agriculture—seat belt anchorages.



The subject of this paper is a research project, commissioned by the Department of Transport2
and undertaken by the Transport Research Laboratory and Silsoe Research Institute to investigate
the use of seat belts for on-road use, with a view to reducing on-road fatalities.

Attitudes of suppliers and drivers


The corporate opinion of the suppliers was that “we fulfil our responsibility by providing anchor
points and stock seat belts which are provided as an option”. Except for JCB “Fastracs”, these
are invariably lap belts. The difficulty of fitting a diagonal belt to a suspended tractor seat was
frequently raised: this would entail all three anchorage points moving with the seat.

Two small samples of farmers, from areas of contrasting farming systems, were interviewed.
Collectively, the farmers had little experience of wearing seat belts on tractors but, because
the owners of JCB Fastracs were specifically targeted, two farmers from Cheshire and three
farmers from Scotland (around Perth and Inverness) had had the opportunity to use seat belts.
Their attitudes are summarised in Table 1.

Table 1. Tractor drivers’ attitudes to on-road wearing of seat-belts

The solitary driver who does wear a belt drives a FASTRAC and stated that, to do so, he would
have to be on a relatively long (roughly an hour or more) and high speed (over 40 mph) journey.
Two criticisms were made by a majority of the farmers (who had little or no experience of
seat belt use on tractors).

• Seat belts would be too restrictive on body movement;


• It would be a nuisance to keep unfastening and fastening when they dismount and mount
so frequently. This is regarded as a source of unnecessary delay.

Half the farmers said they spent little time on the road and that the combined exposure and risk
of an injury was too small to justify wearing a lap belt. However, half said they would wear
them off-road on “steep” slopes. Some of the farmers, particularly those with a generally positive
attitude towards wearing seat belts, felt that a lap belt could be more hazardous than no belt, as
injuries suffered by use of the belt could be worse than injuries without it.

A minority of farmers commented that seat belts:

• would get dirty, would get in the way and so become dangerous;
• were acceptable, although their use should not be made compulsory (it should be a matter
of individual judgement).

2 now Department of the Environment, Transport and the Regions



Some comments were made on how driver behaviour might change, causing tractors to be driven
more dangerously if the wearing of a seat belt offered the driver a “false sense of security”. This
risk compensation argument has been used against many safety improvements and usually has
been found to be false, but it remains conjectural and does emphasise the importance of drivers’
perceptions of risk, danger and injury. In this respect training could play a major role. In fact, a
number of farmers suggested that public money would be spent more effectively on training
drivers to drive safely than on addressing the consequences of dangerous driving.

Design and fitting of seat belts


The particular difficulty with fitting and using seat belts in agricultural vehicles is the low
frequency, high amplitude movement of the seat. It is almost inevitable that, unless seat belts
are designed with this in mind to ensure that they are comfortable and do not unduly hinder
the driver’s operations, wearing rates will remain very low.

From the design point of view, fitting seat belts, especially diagonal ones, to suspended seats
is a major problem because of the additional strength requirement for the seat and/or floor.
Changes in seat design to accommodate a shoulder-level anchorage point are unlikely to find
favour with drivers as a high seat back would hinder many off-road operations. It is apparent
that drivers remove head restraints, when they are fitted, because of the need for rearward
monitoring for many off-road activities. To avoid the need for a high seat back, a shoulder-level anchorage point independent of the seat could be fitted, but this would be difficult to implement because of the problem of locating a suitable anchorage point (which could not move with the seat). From the driver's point of view, wearing a diagonal belt would be too restrictive on body movement off-road, although our findings suggested that, if an accommodating design could be specified, farmers (drivers) would be more amenable to using seat belts. However, the majority felt that their use should not be mandatory.

There are two main variants of lap belt: those with loose straps, which tend to fasten centrally, and those with one retractable strap, which fastens into a fixed mechanism. These features, together with the design of the buckle or fastening mechanism, can influence user acceptability. The retractable type may or may not have emergency locking (e.g. an inertia reel); where fitted, this allows the length of the belt to change as the driver makes small movements but prevents large or sudden movements.

Laboratory tests
To provide information on what real practical difficulties might be involved, particularly
regarding nuisance and delays, a short series of laboratory tests was carried out. The tests
made use of Silsoe Research Institute’s three-axis vibration rig. This is effectively an
immobile tractor, mounted on three hydraulic rams. These rams are controlled by a computer
in such a way that the vibrations produced are a realistic simulation of those experienced on a
real tractor. Test subjects can sit on the rig and perform various tasks which are also
representative of real conditions.

For the purposes of this project, four subjects of varying shapes and sizes were asked to
complete various tasks. Each trial simulated on-road driving and lasted approximately 25
minutes, during which each subject had to get off and on the rig five times. Between these
events, other simulated driving tasks, typical of road use, were undertaken. These tasks
included both stretching forward and looking to the rear of the tractor. Each trial was
recorded on video tape which facilitated subsequent work-study analysis. The subjects
completed a questionnaire after each trial to give their own comments on the use of seat belts.
Observing the behaviour of the subjects also provided some insight into the nuisance factors.

The four subjects were all experienced tractor drivers and ranged from 55kg to 106kg in
weight, from 1.65m to 1.80m in height, and 330mm to 455mm in hip breadth. Details are
given in Table 2; subject 3 was female. All testing was carried out with a KAB U2 seat with
XH2 suspension unit. Four seat belt conditions were used by each subject:

i) no belt fitted
ii) standard (static) lap belt fitted
iii) retractable lap belt fitted
iv) retractable lap belt and arm-rests fitted to seat.

Table 2 Relevant anthropometric data for the subjects

Laboratory findings
Objective data were obtained by studying the video recordings to determine the times needed
for unfastening and fastening the seat belts (not for condition i) and the times taken to
dismount and remount the rig between successive spells of simulated driving (see figure 1).
These have been interpreted to give the following objective information.

• Seat belts increased the time taken to get off and on the tractor (3-axis vibration rig) by
roughly 7.5 s (p<0.01).
• The time required to manipulate (fasten and unfasten) the seat belts (all conditions
combined) was approximately 9 s.

Figure 1. Dismounting and remounting times



• The retractable belts were unfastened more quickly than the standard belts (p<0.01), but there were no significant differences in fastening times (an illustrative timing comparison is sketched below).
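As an illustration of the kind of within-subject comparison underlying the figures above, the Python sketch below applies a paired t-test to hypothetical per-subject dismount/remount times with and without a belt. The values and variable names are invented for demonstration and are not the study data.

from scipy import stats

# Hypothetical mean dismount/remount times (seconds) per subject, invented for
# illustration only; the study's actual data are summarised in Figure 1.
no_belt   = [14.2, 15.0, 13.8, 16.1]
with_belt = [21.5, 22.8, 20.9, 24.0]

diffs = [w - n for w, n in zip(with_belt, no_belt)]
mean_increase = sum(diffs) / len(diffs)

# Paired (within-subject) t-test on the belt versus no-belt times
t_stat, p_value = stats.ttest_rel(with_belt, no_belt)
print(f"mean increase = {mean_increase:.1f} s, t = {t_stat:.2f}, p = {p_value:.4f}")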

From the questionnaires the following subjective information was obtained.

• The retractable belts were preferred by 3 of the 4 subjects. No subject preferred the
standard belt.
• The subject who did not express a preference was the quickest at manipulating the
standard belt, but differences between subjects for manipulating belts were not significant.

The researchers’ observations are summarised below.

• The quicker unfastening of the retractable belts is attributed to the one part being fixed and
the other not needing to be placed anywhere.
• The slower fastening of the retractable belts is attributed to occasional seizures of the
inertia reel.
• The subjects’ overall preference for the retractable belt is attributed to their considering
the comfort during use to be more important than occasional difficulties in fastening.
• The effect of the arm rests is unexpected, in that the (retractable) belts were manipulated more quickly in this condition. This could be because the arm-rest trial was always conducted last for each subject; the practice gained, and the increased understanding of the characteristics of the inertia reel, may have enabled the subjects to use the retractable belt more quickly.

Conclusions
Retractable lap belts are the most acceptable form of tractor occupant restraint system. Test
subjects found them to be comfortable, not to hinder them in a variety of simulated driving
tasks, and to add less than 8 seconds to a typical dismount-remount operation.
The small selection of farmers questioned were not averse to wearing seat belts when they judged them to be appropriate, but strongly disagreed with their compulsory use. Driver education and persuasion might be the most effective methods of maximising wearing rates.

Acknowledgement
This research was funded by the UK Department of Transport, 1995–96.

Reference
Robinson, B.J. (1994). Fatal accidents involving “other motor vehicles”, 1991–92. TRL
Project Report PR/VE/100/94. Crowthorne: Transport Research Laboratory.
(Unpublished report available on direct personal application only).
NOISE AND
VIBRATION
AUDITORY DISTRACTION IN THE WORKPLACE:
A REVIEW OF THE IMPLICATIONS FROM
LABORATORY STUDIES.

Simon Banbury1 and Dylan Jones2

1 Air Human Factors, Defence Evaluation and Research Agency,


Farnborough, UK.
2 School of Psychology, University of Wales, Cardiff.

In the decades between 1950 and 1980, research on the behavioural effects of noise emphasised the role of the intensity of the sound, searching for a threshold at which white noise would impair performance. Latterly, emphasis has shifted to the distracting effects of sound, particularly the effects of speech on cognitive processing; these experiments have produced distinctly different conclusions.
Findings in this area may have important implications for an understanding of
efficiency at the workplace. The support for three key claims is reviewed: (a)
that the degree of distraction is unrelated to the intensity of the sound; (b) that
speech and non-speech sounds are functionally equivalent in their effect; and
(c) that the sound must be segmented into a sequence of events that are different
in timbre or pitch for it to be disruptive. The implications of these results for
practical settings are discussed.

Introduction
The effects of background sounds on task performance are of great relevance to the study of
efficiency at the place of work, whether the work is undertaken in the office or on the flight
deck of an aircraft. Whilst the effects of loud white noise on cognitive processing have been
generally very inconsistent, the deleterious effects of extraneous speech on cognitive
processing are more consistent and, arguably, more relevant to the modern workplace. A
number of laboratory studies have shown large, consistent and replicable disruption of
performance, effects which appear to have relevance for the workplace.
The first report of what became known as the “irrelevant speech effect” was by Colle and
Welsh (1976). They established the classic paradigm that has been used with minor variation
since: typically, experimental subjects are asked to retain and report in correct serial order a
sequence of verbal items (usually consonants) from a visually-presented sequence. After
seeing the list the subjects are asked to rehearse the items for a short interval and then to
report them in serial order when prompted. For some of the lists, background sound is played,
but in these cases the subjects are expressly asked to ignore any sound they hear.
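By way of illustration, the short Python sketch below scores a single trial of this paradigm under strict serial recall, in which an item counts as correct only if reported in its presented position. The lists and the function name are hypothetical and are not taken from the cited studies.

def serial_recall_score(presented, reported):
    # Strict serial scoring: an item is correct only if it is reported in the
    # position in which it was presented.
    correct = sum(1 for i, item in enumerate(presented)
                  if i < len(reported) and reported[i] == item)
    return correct / len(presented)

presented = list("FKQRLMZJ")   # visually presented consonant list (hypothetical)
reported  = list("FKQLRMZJ")   # a hypothetical response containing one transposition
print(serial_recall_score(presented, reported))   # 0.75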

Early studies established key features of this disruption by irrelevant speech, among them
that the degree of disruption was not dependent on the meaning of the sound (a conclusion
supported in much subsequent work: Salamé & Baddeley, 1982; Jones, Miles & Page, 1990;
Banbury and Berry, 1997); nor on its intensity (at least within a limited range). However,
recent research has found that the effect is not confined to speech, since the effect can be
found with tones (Jones & Macken, 1993) or pitch glides (Jones, Macken & Murray, 1993).
Thus, it is now referred to as the irrelevant sound effect.
These laboratory findings have a number of important implications for an understanding
of efficiency at the workplace. Thus, this paper reviews the following key features of the
“irrelevant sound effect”: ➀ that the degree of distraction is unrelated to the intensity of the
sound; ➁ that speech and non-speech sounds are functionally equivalent in their effect; and ➂
that the sound must be segmented into a sequence of events that are different in timbre or
pitch for it to be disruptive.

Intensity and meaning have no effect


The effects of white noise on cognitive processing have been generally very inconsistent,
showing improvements in performance, reductions in performance or no effects at all (see
Jones, 1990 for a review). However, the occurrence of white noise is relatively rare in most
workplaces, so this review will concentrate on the effects of more common background
sounds, such as speech (e.g. from co-workers), and tones (e.g. from alarms and auditory
feedback in the operation of equipment). Indeed, there is little evidence to suggest that the
effects of white noise are similar to those of speech. Most notably, the effect of irrelevant
speech on serial recall is independent of intensity: the disruption is roughly the same whether the sound level is equivalent to a whisper [48 dB(A)] or a shout [76 dB(A)]. This is true whether the level is varied between trials or within trials (Tremblay and Jones, 1998).
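To put that range in perspective, the small calculation below (illustrative only) shows that a 28 dB difference corresponds to roughly a 630-fold difference in sound intensity, and about a 25-fold difference in sound pressure, yet the reported disruption is roughly constant.

level_whisper, level_shout = 48.0, 76.0        # dB(A), as quoted above
delta_db = level_shout - level_whisper         # 28 dB
intensity_ratio = 10 ** (delta_db / 10)        # ~631 times the sound intensity
pressure_ratio = 10 ** (delta_db / 20)         # ~25 times the sound pressure
print(f"{intensity_ratio:.0f}x intensity, {pressure_ratio:.1f}x pressure")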
That the meaning of sound is an important determinant of disruption, although intuitively plausible, has received little experimental support. A number of studies using serial recall tasks have shown that speech in a language a person does not understand leads to disruption (Colle and Welsh, 1976; Colle, 1980; Banbury and Berry, in press), and that the effect is roughly the same as for narrative English for English speakers (Jones, Miles and Page, 1990). However, a recent study by LeCompte, Neely and Wilson (in press) contests the assumption that the disruption observed on serial recall tasks is independent of meaning. They found some tentative evidence that meaningful speech disrupts serial recall more than meaningless speech; however, their results merely indicated a trend in the data, and failed to reach conventional significance levels.
Studies on the effect of meaning on more complex cognitive tasks have been inconsistent.
Reading tasks, such as text comprehension, are susceptible to the meaning of background
speech (Martin, Wogalter and Forlano, 1988; Jones, Miles and Page, 1990). However,
Banbury and Berry (1997, in press) have shown that the disruption on mental arithmetic and
memory for prose tasks by background speech was not mediated by its meaning.
Nevertheless, for seriation-based tasks (i.e. those that require the order of information to be maintained correctly), the weight of evidence suggests that the disruption is independent of both the intensity of the sound and its meaning.

Speech and non-speech are equipotent


Salamé and Baddeley (1982) suggest a model to account for the disruption by background
sounds. They assume that in the process of reading, material in written form is transformed
into phonological code, a code which is based on the sound of the material rather than its
appearance or meaning. This set of codes conflicts in memory with phonological codes
resulting from privileged access of speech to phonological memory. They suggest a
mechanism in which the degree of disruption is proportional to the phonological similarity of
items from two sources. Thus only speech, they argue, can show an irrelevant speech effect.
Evidence for this assertion is based on a serial recall task in which sequences of single-digit numbers were used as the to-be-remembered materials. Irrelevant auditory materials with
different degrees of similarity to the to-be-remembered sequences were compared:
semantically-similar words (integers) and phonologically-similar words (tun—one, gnu—
two, tee—three, sore—four, etc.) gave marked but comparable degrees of disruption, while
the effect of phonologically-dissimilar words (tennis, jelly, tipple, etc.) was significantly less,
but still appreciably worse than a quiet control (Salamé and Baddeley 1982, Experiment 5).
The closer the resemblance of heard and rehearsed materials, it was argued, the greater the degree of disruption.
However, their model based on the phonological similarity between the two streams
remains controversial. Effects may be found with non-verbal memory and also with non-
verbal irrelevant sounds. Studies by Jones, Farrand, Stuart and Morris (1995) using spatial
memory tasks with no verbal component, and Jones and Macken (1993) using random tones,
suggest that some factor other than phonological confusion is responsible for the disruptive
effects observed. Furthermore, a study by Jones, Macken and Murray (1993) found that
random pitch glides interspersed with short periods of silence could also produce similar
disruptive effects to that of speech.
Recent research by LeCompte, Neely and Wilson (in press), however, contests Jones and
Macken’s findings that the disruption from tones and syllables (i.e. speech and non-speech
sounds) is equipotent. An exact replication of their paradigm, albeit with twice the number of participants, found that speech disrupted serial recall to a greater degree than did tones.
However, Jones and Macken’s (1993) “equipotentiality hypothesis” has been supported by
Banbury and Berry (in press, 1997), who generally found that speech and the noise from
office equipment (i.e. consisting of telephone, printer and fax noise, but no speech) produced
similar levels of disruption to memory for prose and mental arithmetic tasks.
Overall, these results suggest that tones and speech are equipotent, which confounds any
account based on the similarity of the visual and auditory material. Instead, some simple
analysis of the auditory stream, insensitive to the gross acoustical differences between speech
and steady-state tones, serves as the basis for disruption. This simple analysis underpins the
“Changing State hypothesis” put forward by Jones, Madden and Miles (1992) to account for
these results.

“Auditory changing state” is a necessary condition


The Changing State hypothesis originally formulated by Jones, Madden and Miles (1992)
argues that the “irrelevant” sound stream has to show an appreciable acoustic variation (in all
but intensity) from one segmented entity to the next, rather than the assumption that the sound
has to be “speech-like” before it disrupts serial recall. A necessary precursor to changing state
must be some means of segmenting this sometimes physically continuous signal into its
component units. If the onset of a sound is masked, as in a speech babble, the disruptive effect is rather small. Sounds that do not contain sharp transitions in energy, such as
continuous pitch-glides, therefore show reduced levels of disruption. Once a sound is
segmented the important determinant of disruption is the degree of stimulus mismatch
between successive stimuli. However, this effect is non-monotonic. As the difference in
acoustic properties of successive items in the series increases (such as one from a single voice
or single instrument) there is initially an increase in disruption; however, beyond the point at
which the differences become so great that each member of the sequence constitutes a distinct
perceptual object (such as a different voice, or different instrument) the degree of disruption
begins to diminish. That is, physical change increases disruption as long as the identity of the
‘object’ producing the sound remains throughout the series (see Jones, Alford, Bridges,
Tremblay & Macken, 1997 for a discussion).
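As a purely illustrative sketch of the idea of token-to-token change (not the measure used by Jones and colleagues), the Python fragment below segments a signal into fixed-length tokens and averages the spectral distance between successive tokens; a repeated tone yields an index near zero, while a sequence of different tones yields a larger value. All names, signals and parameters are hypothetical.

import numpy as np

def changing_state_index(signal, fs, token_len_s=0.25):
    # Segment the signal into fixed-length "tokens" and compare the unit-norm
    # magnitude spectra of successive tokens.
    n = int(token_len_s * fs)
    tokens = [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]
    spectra = []
    for tok in tokens:
        mag = np.abs(np.fft.rfft(tok * np.hanning(len(tok))))
        spectra.append(mag / (np.linalg.norm(mag) + 1e-12))
    diffs = [np.linalg.norm(a - b) for a, b in zip(spectra[:-1], spectra[1:])]
    return float(np.mean(diffs))

fs = 8000
t = np.arange(0, 2.0, 1 / fs)
steady = np.sin(2 * np.pi * 500 * t)                        # one repeated tone
changing = np.concatenate([np.sin(2 * np.pi * f * t[:fs // 2])
                           for f in (400, 700, 550, 900)])  # a sequence of different tones
print(changing_state_index(steady, fs), changing_state_index(changing, fs))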

Implications for the workplace


The practical implications of the deleterious effects of extraneous sounds are fairly clear.
Extraneous sound is increasingly common in a range of work environments, such as open-
plan offices, aircraft cockpits and various kinds of command and control centres. If irrelevant
speech impairs performance on tasks involving primary memory, then job performance may
be affected adversely.
A small number of studies have attempted to research the effects of background noise in
open-plan office environments (e.g. Banbury and Berry, 1997; Banbury and Berry, in press).
Consistent with many observational studies conducted after the introduction of open-plan
offices in the 1960s, these results highlight people’s susceptibility to disruption from
extraneous background noise, even when they are not attending to the noise. Banbury and
Berry’s (1997) finding that nonspeech sounds, such as telephones and printers, can cause as
much disruption as irrelevant speech sounds is of particular interest. This has clear
implications for office work, particularly with the increasing trend by corporations to move
toward open-plan offices. Not only is the performance of office workers likely to be affected
by conversations of their co-workers but also by the office equipment they have to use. Office
planners need to find ways of reducing subjective noise levels, for example by providing
adequate partitioning and sound insulating materials. Alternatively, they could consider
introducing a continuous noise that serves to mask not only the background speech, but also
the equipment noise, so that these sounds become less distinct from one another (Jones and
Macken, 1995).
Other settings, such as control centres and aircraft cockpits, have received less scrutiny in
this respect1. Considering that much of the complex, error-critical decision making in the
military and civil environments is undertaken in noisy aircraft cockpits and control centres, it
is surprising that relatively little research to date has been conducted in these domains.

1Experiments currently in progress at DERA (Farnborough) are investigating the effects of extraneous
sounds in aircraft cockpits and airborne control centres (SB).

Nevertheless, it is clear that any reduction in disruption by background sounds is only possible through a better understanding of why background sounds are disruptive, rather than through reducing the intensity of the sounds per se.

References
Banbury, S., & Berry, D.C. (in press). Disruption of office related tasks by speech and office
noise. British Journal of Psychology.
Banbury, S., & Berry, D.C. (1997). Habituation and dishabituation to speech and office noise.
Journal of Experimental Psychology: Applied, 3, 1–16.
Colle, H.A. (1980). Auditory encoding in visual short-term recall: Effects of noise intensity
and spatial location. Journal of Verbal Learning and Verbal Behavior, 19, 722–735.
Colle, H.A., & Welsh, A. (1976). Acoustic masking in primary memory. Journal of Verbal
Learning and Verbal Behaviour, 15, 17–31.
Jones, D.M. (1995). The fate of the unattended stimulus: Irrelevant speech and cognition.
Applied Cognitive Psychology, 9, 23–38.
Jones, D.M. (1993). Objects, streams and threads of auditory attention. In A.D.Baddeley &
L.Weiskrantz (Eds.), Attention: Selection, awareness and control. Oxford: Clarendon
Press.
Jones, D.M. (1990). Recent advances in the study of human performance in noise.
Environment International, 16, 447–458.
Jones, D.M., Alford, D., Bridges, A., Tremblay, S., & Macken, W.J. (1997). Organisational
factors in selective attention: The interplay of acoustic distinctiveness and auditory
streaming in the irrelevant sound effect. (Manuscript submitted for publication).
Jones, D.M., Farrand, P., Stuart, G., & Morris, N. (1995). The functional equivalence of
verbal and spatial information in serial short-term memory. Journal of Experimental
Psychology: Learning, Memory and Cognition, 21, 1008–1018.
Jones, D.M., & Macken, W.J. (1993). Irrelevant tones produce an irrelevant speech effect:
Implications for phonological coding in working memory. Journal of Experimental
Psychology: Learning, Memory and Cognition, 19, 369–381.
Jones, D.M., & Macken, W.J. (1995). Auditory babble and cognitive efficiency: Role of
number of voices and their location. Journal of Experimental Psychology: Applied, 1,
216–226.
Jones, D.M., Macken, W.J., & Murray, A.C. (1993). Disruption of visual short-term memory
by changing-state auditory stimuli: The role of segmentation. Memory and Cognition,
21, 318–328.
Jones, D.M., Madden, C., & Miles, C. (1992). Privileged access by irrelevant speech to short-
term memory: The role of changing state. Quarterly Journal of Experimental
Psychology, 44A, 645–669.
Jones, D.M., Miles, C., & Page, J. (1990). Disruption of reading by irrelevant speech: Effects of
attention, arousal or memory? Journal of Applied Cognitive Psychology, 4, 645–669.
LeCompte, D.C., Neely, C.B., & Wilson, J.R. (in press). Irrelevant speech and irrelevant
tones. Journal of Experimental Psychology: Learning, Memory and Cognition.
Martin, R.C., Wogalter, M.S., & Forlano, J.G. (1988). Reading comprehension in the
presence of unattended speech and music. Journal of Memory and Language, 27, 382–
398.
Salamé, P., & Baddeley, A.D. (1982). Disruption of short-term memory by unattended
speech: Implications for the structure of working memory. Journal of Verbal Learning
and Verbal Behavior, 21, 150–164.
Tremblay, S., & Jones, D.M. (1998). The role of habituation in the irrelevant sound effect:
Evidence from the effects of token set size and rate of transition. Journal of
Experimental Psychology: Learning, Memory and Cognition (in press).
TRANSMISSION OF SHEAR VIBRATION THROUGH GLOVES

Gurmail S.Paddan and Michael J.Griffin

Human Factors Research Unit


Institute of Sound and Vibration Research
University of Southampton
Southampton
SO17 1BJ England

International Standard ISO 10819 (1996) proposes a test method for the
measurement of the vibration transmission of gloves but does not address the
testing of gloves during exposure to vibration in a shear axis. Experiments
have been conducted to measure the frequency transmissibility of shear axis
vibration through a selection of gloves to the palm of the hand. Ten gloves,
some ‘antivibration gloves’, were used in the investigation. The subjects (8
males) pushed on a horizontal handle with a force of 20 Newtons; no grip
force was applied. The frequency-weighted magnitude of random horizontal
vibration on the handle was 5 ms–2 r.m.s.; vibration was measured at the
vibrating handle and at the palm of the hand using a palm-glove adaptor.
Transmissibilities between the handle and the palm adaptor are presented for
all gloves and all subjects. It was found that a few gloves gave resonances at
about 200Hz and attenuated vibration at frequencies above about 300Hz.
Other gloves had resonances at higher frequencies and some gloves
amplified the shear axis vibration at all frequencies up to about 1000 Hz.

Introduction
The hands of operators are exposed to multi-axis vibration when using vibrating tools. The
directions of the vibration include the three translational axes: fore-and-aft, lateral and
vertical, as specified in British Standard BS 6842 (1987). Some tools expose the hands of
operators to vibration predominantly in an axis parallel to the surface of the hand: vibration in
a shear axis. Examples include a percussive chipping hammer when holding the chisel.
International Standard 10819 (1996), which might be used to determine the vibration
transmissibility of a glove, considers the transmission of vibration through the palm area of
the glove to the hand in a direction perpendicular to part of the palm. With the glove worn on
the hand, the horizontally vibrating handle is held such that acceleration occurs along the
forearm, that is, in the z-axis of the hand. The measures obtained according to ISO 10819 are
used to determine whether a glove can be considered to be an ‘antivibration’ glove. However,
International Standard ISO 10819 (1996) does not address the testing of gloves during
exposure to vibration in a shear axis.
This paper presents data on the transmission of vibration in the shear axis as a function of
frequency for a selection of gloves. The transmission of shear axis vibration through gloves
has received little previous attention, even though many vibrating tools expose the hands of
workers to high levels of shear vibration. The variability in the transmission of shear
vibration between subjects and between gloves has been determined. The data presented in
this paper are taken from a larger study which also investigated the effect of push force on the
transmission of shear vibration.

Equipment and Procedure


The experiment was conducted using an electrodynamic vibrator, Derritron type VP30,
powered by a 1500 watt amplifier. A basic handle comprising a steel bar of diameter 32mm
and length 102mm was attached to the vibrator such that the grip of the hand would be
horizontal and in line with the axis of vibration. The first resonance of the handle occurred at
approximately 1340Hz.
Acceleration was measured at two locations: on the vibrating handle, and between the
palm of the hand and the glove using a palm adaptor of mass 9.21 grams (ISO 10819 states a
maximum mass of 15 grams). The accelerometers were of piezoelectric type (Brüel and Kjær
type 4374) each with a mass of 0.65 gram. The acceleration signals from the two locations
were passed through charge amplifiers (Brüel and Kjær type 2635) and then acquired into a
computer-based data acquisition and analysis system.
The subjects stood on a horizontal surface and applied a downward force with the right
hand on to the laterally vibrating handle. The subjects held their forearms horizontal at an
angle of 90° to the axis of vibration. The gloved hand was placed on the handle such that the
metacarpal bones were horizontal and at right angles to the axis of vibration. The elbow
formed an angle of approximately 180° between the forearm and the upper arm. There was no
contact between the elbow and the body during the measurements. A downward push force of
approximately 20N was applied during the measurements; no grip force was applied. A copy
of the written instructions given to subjects is shown in the Appendix.
Ten commercially available gloves were tested (see Paddan, 1996, for details of gloves).
In accord with ISO 10819 (1996), the gloves were worn by the subjects for at least 3 minutes
prior to the vibration measurements. The room temperature during the tests fluctuated
between 22°C and 25°C (the standard specifies a temperature range of 20±5°C) and the
relative humidity varied between 37% and 51% (the standard specifies that the relative
humidity shall be below 70%).
The experiment was approved by the Human Experimentation Safety and Ethics
Committee of the Institute of Sound and Vibration Research. Eight right-handed male
subjects participated in the inter-subject variability study (mean age 28.75 years; mean
weight 71.25kg; mean height 1.78m). Each subject was exposed to the vibration eleven times:
once with the ungloved hand and once with each of the ten gloves.
A commercial data acquisition and analysis system, HVLab, developed at the Institute of
Sound and Vibration Research of the University of Southampton, was used to conduct the
experiment and analyse the acquired data. A computer-generated Gaussian random waveform
having a nominally flat acceleration spectrum was used with a frequency-weighted
acceleration magnitude of 5.0 ms–2 r.m.s. at the handle. The frequency weighting used was Wh
as defined in British Standard BS 6842 (1987). The frequency range of the input vibration
was 6 Hz to 1800 Hz. The waveform was sampled at 6097 samples per second and low-pass
filtered at 1800 Hz before being fed to the vibrator. Acceleration signals from the handle and
the palm adaptor were passed through signal conditioning amplifiers and then low-pass
filtered at 1800 Hz via anti-aliasing filters with an elliptic characteristic; the attenuation rate
was 70 dB/octave in the first octave. The signals were digitised into a computer at a sample
rate of 6097 samples per second. The duration of each vibration exposure was 5 seconds.
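A minimal Python sketch of the kind of drive signal described is given below, under the assumption that the Wh frequency weighting of BS 6842 is omitted and the target r.m.s. is applied to the unweighted signal (in the study the 5 ms–2 r.m.s. target applied to the frequency-weighted handle acceleration). The filter choice and variable names are illustrative only.

import numpy as np
from scipy.signal import butter, filtfilt

fs = 6097                      # samples per second
duration = 5.0                 # seconds of exposure
target_rms = 5.0               # m/s^2 r.m.s. (unweighted in this sketch)

rng = np.random.default_rng(0)
x = rng.standard_normal(int(fs * duration))          # flat-spectrum Gaussian noise

# Band-limit to roughly 6-1800 Hz with a Butterworth band-pass filter
b, a = butter(4, [6 / (fs / 2), 1800 / (fs / 2)], btype="band")
x = filtfilt(b, a, x)

x *= target_rms / np.sqrt(np.mean(x ** 2))           # scale to the target r.m.s.
print(np.sqrt(np.mean(x ** 2)))                       # ~5.0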

Analysis
Transfer functions were calculated between acceleration on the handle (i.e. the input) and
acceleration measured at the palm-glove interface adaptor (i.e. the output). The ‘cross-
spectral density function method’ was used. The transfer function, Hio(f), was determined as
the ratio of the cross-spectral density of input and output accelerations, Gio(f), to the power
spectral density of the input acceleration, Gii(f): Hio(f)=Gio(f)/Gii(f). Frequency analysis was
carried out with a resolution of 5.95Hz and 124 degrees of freedom.
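The sketch below illustrates the cross-spectral density method described above, assuming the handle and palm-adaptor acceleration time histories are available as arrays; with a sampling rate of 6097 samples per second and 1024-point segments, the frequency resolution is approximately 5.95 Hz, as in the text. The placeholder signals and names are illustrative, not the measured data.

import numpy as np
from scipy.signal import csd, welch

fs = 6097
nperseg = 1024

# Placeholder signals for illustration only; in practice these would be the
# measured acceleration time histories at the handle and at the palm adaptor.
rng = np.random.default_rng(1)
handle = rng.standard_normal(fs * 5)
palm = 0.6 * handle + 0.1 * rng.standard_normal(fs * 5)

f, Gii = welch(handle, fs=fs, nperseg=nperseg)        # input power spectral density
_, Gio = csd(handle, palm, fs=fs, nperseg=nperseg)    # input-output cross-spectral density

H = Gio / Gii                                          # Hio(f) = Gio(f) / Gii(f)
transmissibility = np.abs(H)                           # modulus, as plotted in Figures 1 and 2
print(round(f[1] - f[0], 2))                           # ~5.95 Hz frequency resolution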

Results and Discussion


Figure 1 shows individual shear transmissibilities between the handle and the adaptor for the 8
subjects pushing on the handle with a force of 20N. The palm adaptor was inserted between the
palm of the hand and the glove. The transmissibilities show different dynamic characteristics
for the 10 gloves. However, there appear to be two main categories for the data shown: gloves
which demonstrate a low frequency peak in transmissibility (i.e. below about 300Hz) and gloves
which show a peak in transmissibility over the frequency range 300Hz to 700Hz. Gloves which
show a low-frequency peak in transmissibility are gloves 2 and 10; these show significant
attenuation of vibration for frequencies above about 400Hz. All the other gloves could be put
into the second category. (Gloves 2 and 10, together with glove 6, showed low-frequency peaks
in transmissibility when tested using vibration perpendicular to the palm, see Paddan, 1996.)
It is interesting to note that glove 5 showed almost no attenuation in vibration at the
adaptor compared to the vibration on the handle. Indeed, the glove amplified the shear
vibration from the handle to the adaptor by more than 50% at the peak transmissibility
between 600Hz to 1000Hz.
Variation in glove transmissibility in the shear axis between individuals is apparent in Figure 1.
An example of the variation is seen for glove 3 where one subject showed a transmissibility of
0.18 at 1200Hz whereas another subject showed a transmissibility of 0.96: the maximum response
being over 5 times the minimum transmissibility. Variation of similar order has been seen for
glove transmissibilities perpendicular to the surface of the palm (Paddan and Griffin, 1997).
Median transmissibilities for the ten gloves were calculated and are presented in Figure 2.
A large variation in transmissibility is apparent between gloves. At a frequency of 875Hz,
glove 5 (highest curve) shows a median transmissibility that is over 19 times greater than the
median transmissibility for glove 2 (lowest curve).
In view of the high levels of shear vibration to which the operators of vibrating tools are
exposed (see for example Nelson, 1997), the revision of ISO 10819 (1996) should consider
the additional measurement of the transmission of vibration in shear axes. A factor that is of
importance in the transmission of shear vibration through the glove is the effect of grip force;
some vibrating tools require operators to hold tools with high grip forces. Other work has
shown that an increase in force results in increased transmission of vibration through the
glove to the palm of the hand.

Figure 1. Glove shear transmissibilities for 8 subjects pushing with a force of 20N
(5.95Hz frequency resolution, 124 degrees of freedom)

Figure 2. Median shear vibration transmissibilities for 10 gloves

The effect of the transmission of shear vibration on the hands has received little
attention. Since the vibration transmitted to the hand in the shear direction can be as great as
the vibration occurring perpendicular to the palm, this is a topic which requires further
consideration.

Conclusions
A large variation in shear transmissibilities has been found for the 10 gloves tested. Some
gloves transmitted only low frequency vibration (below about 300Hz) while other gloves also
transmitted frequencies well above about 300Hz. Some gloves offer no attenuation of shear
axis vibration at any frequency below 1000Hz: they appear to amplify shear axis vibration at
all relevant frequencies.

References
British Standards Institution 1987, British Standard Guide to Measurement and evaluation of
human exposure to vibration transmitted to the hand, BS 6842. London: BSI
International Standards Organization 1996, Mechanical vibration and shock—Hand-arm
vibration—Method for the measurement and evaluation of the vibration
transmissibility of gloves at the palm of the hand. ISO 10819 (1996)
Nelson, C.M. 1997, Hand-transmitted vibration assessment—a comparison of results using
single axis and triaxial methods. United Kingdom Group Meeting on Human Response
to Vibration, ISVR, University of Southampton, 17–19 September 1997
Paddan, G.S. 1996, Effect of grip force and arm posture on the transmission of vibration
through gloves. United Kingdom Informal Group Meeting on Human Response to
Vibration, MIRA, Nuneaton, 18–20 September 1996
Paddan, G.S. and Griffin, M.J. 1997, Individual variability in the transmission of vibration
through gloves. In S.A.Robertson. (ed.) Contemporary Ergonomics 1997, (Taylor and
Francis, London), 320–325

Acknowledgements
This work has been carried out with the support of the United Kingdom Health and Safety
Executive.

Appendix
Following are instructions that were given to the subjects taking part in the experiments on
the transmission of shear vibration through gloves.

INSTRUCTIONS TO SUBJECTS
SHEAR VIBRATION: EFFECT OF PUSH FORCE

The aim of this experiment is to measure the effect of push force on the transmission of
shear vibration through gloves to the palm of the hand.
Please stand and rest your right hand on the handle in front such that the forearm is horizontal.
Ensure that your right arm (upper and lower) is not in contact with your body. Throughout each
vibration exposure, you are required to apply a downward push force on the handle.
You must ensure that the adaptor is inserted between the glove and the palm of the hand,
and positioned such that the accelerometer in the adaptor is in line with the direction of
vibration. Just prior to the start of each run, which the experimenter will indicate, you are to
place your right hand on the handle and apply the required push force. This position is to be
maintained throughout the vibration exposure.
You are free to terminate the experiment at any time.
Thank you for taking part in this experiment.
THE EFFECT OF WRIST POSTURE ON ATTENUATION OF
VIBRATION IN THE HAND-ARM SYSTEM

Tycho K.Fredericks1 and Jeffrey E.Fernandez2

1Department of Industrial and Manufacturing Engineering


Human Performance Institute, Western Michigan University
Kalamazoo, MI 49002–5061 USA

2Department of Industrial and Manufacturing Engineering

National Institute for Aviation Research, Wichita State University


Wichita, KS 67260–0035 USA

Eight male university students served as subjects in this experiment.
Subjects were required to perform a simulated riveting task in three wrist
postures (neutral, 1/3 maximum flexion, 2/3 maximum wrist flexion) while
vibration was measured at the hand-handle interface and the styloid process
of the ulna. Results indicated that wrist posture had a significant effect on
vibration transmitted from the hand-handle interface to the wrist. The neutral
wrist posture, in the majority of the cases, was associated with the highest
degree of dampening. Decrements in vibration amplitude from the hand-handle
interface to the wrist were as high as 88.89%.

Introduction
According to the United States Bureau of Labor Statistics (1997) figures for 1995, Work-Related Musculoskeletal Disorders (WMSDs) due to repeated trauma declined by 7 percent over the previous year. This, on the surface, seems encouraging until the finding is linked with the lost-time data (time away from work). Carpal Tunnel Syndrome, a WMSD commonly associated with repeated trauma, was documented to have a median of 30 days away from work (BLS, 1997). Couple the lost-time data with the frequency data, multiply by an indirect cost ranging from $60,000–$100,000 (CTD News, 1993), and it is understandable why industry desires to
keep this illness at bay. For these reasons, the research community has been using the
psychophysical approach in an attempt to develop acceptable work frequencies. A plethora of
previous psychophysical studies have investigated the effects of wrist posture (Marley and
Fernandez, 1995), applied force (Kim and Fernandez, 1993), and gender (Davis and
Fernandez, 1994) on maximum acceptable frequency (MAF). A more recent study
(Fredericks and Fernandez, in press) investigated the effect of vibration, wrist posture, and
applied force on MAF. In this particular study it was determined that MAF decreased
significantly with a deviation in wrist posture and an increase in applied force. It was also
determined that decrements in MAF due to vibration were 36% while decrements due to
wrist posture were 19%. This indicated that vibration transmitted from a rivet gun, as a risk
factor in the development of WMSD, is of more concern than wrist posture. The present study builds upon that work to determine the effect of wrist posture on the attenuation of vibration from the hand-handle interface to the wrist.

Methods and Procedures

Subjects and Design of Experiment


Eight males from the university population served as subjects in this experiment. Each subject was required to perform a riveting task commonly found in the aircraft industry. Wrist postures (neutral, 1/3 maximum flexion, 2/3 maximum wrist flexion) were varied in accordance with a
pilot study conducted at an aircraft company located in the United States. For details of that
study refer to Fredericks and Fernandez (in press). Applied force level and coupling force were
held constant. A complete randomized block design with subjects as blocks served as the statistical
design for this experiment. All trials were presented to the subjects in random order.

Equipment
A workstation was designed to simulate sheet metal riveting activities commonly found in the
aircraft industry. A United Air Tool brand pneumatic hand held rivet gun coupled with a pistol
type grip was used as the hand tool to be used in the simulated task. The weight of the tool
was 3.5 pounds. Three tri-axial Endevco 23 accelerometers were used to measure vibrations.
The preamplification of the signal was performed by Endevco Model D 33 series signal
conditioners. The analog-to-digital (A/D) conversion of the vibration signal was performed by a Keithley Metrabyte DAS 16F A/D board housed in a Zenith 80286 microcomputer.
Processing of these vibration signals was done using the “Snap Master” signal processing
software. One accelerometer was glued to a transducer mount (Rasmussen, 1982) to measure
vibration levels entering the hands and another accelerometer was glued to a wrist mount
(Farkkila, 1978) to measure vibration levels at the styloid process of the ulna.

Procedures
All tasks were performed with the subject's preferred hand. All anthropometric measurements of the hand and wrist were also taken using the preferred hand. For each wrist posture, plus a replication of one posture, a psychophysically adjusted task frequency was determined by the method of adjustment. The subject was allowed to adjust the frequency of the task using the up and down arrows on the keyboard for the first 20 minutes. At the end of the 23rd minute, physiological measures (heart rate, blood pressure) and ratings of perceived exertion were taken. The subject then momentarily halted activity while the vibration equipment was fitted (this took approximately 1.5 minutes). This equipment included the previously described transducer mount for the hand and an arm clasp for the wrist, as well as the EMG sensors. Subjects then continued the task and a 2-second sample was recorded digitally at 7000 Hz.

Results and Discussion


Descriptive statistics for the eight male subjects are presented in Table 1. A statistical comparison was made between the subjects' height, weight, and grip strength in this study and those from a larger population (Viswanath and Fernandez, 1992; Ayoub et al., 1985). Results of t-tests indicated that there were no significant differences between the two, suggesting that the subjects used in this study are representative of the larger population.
Hand tool vibration was measured in the three orthogonal directions as recommended by
NIOSH (1989). The signals, corresponding to the X, Y, and Z axes, were recorded and processed in several different ways. A root-mean-square (RMS) value was calculated to determine the energy content of the vibration. Further analysis occurred after a Fast Fourier Transform was performed
to determine acceleration values within specific frequency regions. Frequency regions reviewed
included: 0–100 Hz (FWA), 101–200 Hz (FWB), 201–3500 Hz (FWC), and 0–3500 Hz (FWD).
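The Python sketch below illustrates, under stated assumptions, how the overall r.m.s. value and the band-limited values for the FWA–FWD regions might be computed from a sampled record; it does not reproduce the study's actual processing (performed in the "Snap Master" software), and the record used here is a placeholder.

import numpy as np

fs = 7000
rng = np.random.default_rng(2)
a = rng.standard_normal(2 * fs)           # placeholder 2-second acceleration record

rms_overall = np.sqrt(np.mean(a ** 2))    # overall energy content

# Band-limited r.m.s. from the FFT (Parseval's relation over each band)
spectrum = np.fft.rfft(a)
freqs = np.fft.rfftfreq(len(a), d=1 / fs)
power = (np.abs(spectrum) ** 2) / len(a) ** 2
power[1:-1] *= 2                          # account for the negative frequencies

bands = {"FWA": (0, 100), "FWB": (101, 200), "FWC": (201, 3500), "FWD": (0, 3500)}
band_rms = {name: np.sqrt(power[(freqs >= lo) & (freqs <= hi)].sum())
            for name, (lo, hi) in bands.items()}
print(rms_overall, band_rms)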
ANOVA results for the effects of wrist posture on RMS acceleration values and on frequency-weighted acceleration values in the regions FWA, FWB, FWC, and FWD for the hand-arm system are presented in Table 2. Although vibration data were collected at the coupling, the styloid process of the ulna, and the proximal end of the ulna, only results for the vibration at the first two locations are presented in this article. The combined effect of the acceleration associated with the vibration in all three basicentric orthogonal directions was also calculated (NIOSH, 1989).

Table 1. Subject Descriptive Statistics (n=8)

Table 2. ANOVAs for vibration response variables

A Duncan's Multiple Range test was performed to locate the differences for all vibration response variables significantly (α=0.05) affected by wrist posture. For RMS values in the Z-axis, it was determined that the neutral wrist posture had significantly
higher acceleration values associated with it as compared to 1/3 and 2/3 maximum wrist
flexion. Similar findings were determined in the x-axis (0–100 Hz range), z-axis (0–100 Hz
range), combined effect of all three axes (0–100 Hz range), x-axis (101–200 Hz range), and combined effect of all three axes (101–200 Hz range). For the combined RMS values, it was
determined that there were no significant differences between acceleration values obtained in
the neutral and 2/3 maximum wrist postures and no significant difference between 2/3
maximum flexion and 1/3 maximum flexion. Similar findings were found in the z-axis in the
101–200 Hz range. In all of the cases tested, the neutral wrist posture had higher acceleration
values associated with it. This could be attributed to the type of contact the hand has with the
handle in the neutral posture versus the other wrist postures. It has previously been shown that there is a difference between the impedance of the flat hand pushing a plate (Miwa, 1964a, b), the impedance of the hand gripping handles of different diameters (Reynolds and Falkenberg, 1984), and the impedance with a palm grip or finger grip around a handle (Reynolds and Keith, 1977). It is believed that wrist posture had an effect on the contact area of the hand with the handle, thus influencing the vibration transmission characteristics. In addition, wrist posture did not have a significant effect (α=0.05) on vibration above 200 Hz, implying that vibration higher than 200 Hz did not pass through the hand. These findings coincide with those of Reynolds and Keith (1977).
Table 3 presents the findings on the attenuation of vibration from the hand-handle interface to the wrist. In the majority of cases the highest degree of attenuation occurs when the wrist is in the neutral posture. Decrements in vibration amplitude were seen to be as high as 88.89%.

Table 3. Percent attenuation from hand-handle interface to wrist.
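As a worked illustration of the attenuation figure quoted above, the short Python fragment below computes a percent decrement from hypothetical r.m.s. accelerations; the values are invented and chosen only to reproduce an 88.89% decrement, and are not the study data.

# Hypothetical r.m.s. accelerations, for illustration only
a_hand_handle = 45.0    # m/s^2 r.m.s. at the hand-handle interface (invented)
a_wrist = 5.0           # m/s^2 r.m.s. at the styloid process of the ulna (invented)

attenuation_pct = (a_hand_handle - a_wrist) / a_hand_handle * 100
print(f"{attenuation_pct:.2f}% attenuation")   # 88.89% for this pair of values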



Conclusion
Wrist posture has a significant effect on vibration attenuation. The neutral wrist posture has
the highest attenuation associated with it. Engineers should consider different means of decreasing the vibration transmitted through the hand-arm system: investigation of alternative handle shapes, materials, and tool mechanics may reduce this transmission and thus the risk of WMSD.

References
Ayoub, M.M., Smith, J.L., Selan, J.L., Chen, H.C., Fernandez, J.E., Lee, T.Y., and Kim,
K.H. (1985). Lifting in unnatural postures. Institute for Ergonomics Research, Texas
Tech University, Lubbock, Texas.
Davis, P.J. and Fernandez, J.E. (1994). Maximum acceptable frequencies performing a
drilling task in different wrist postures. Journal of Human Ergology, (23)2, 81–92.
Farkkila, M.A. (1978). Grip force in vibration disease. Scand. J. Work Environ. Health, 4,
159–166.
Fredericks, T.K. and Fernandez, J.E. (in press). The effect of vibration on psychophysically
derived work frequencies for a riveting task. International Journal of Industrial
Ergonomics.
Kim, C.H. and Fernandez, J.E. (1993). Psychophysical frequency for a drilling task.
International Journal of Industrial Ergonomics, 12, 209–218.
Marley, R.J. and Fernandez, J.E. (1995). Psychophysical frequency and sustained exertion at
varying wrist posture for a drilling task. Ergonomics, 38(2), 303–325.
Miwa, T. (1964a). Studies in hand protectors for portable vibrating tools. 1. Measurements
of the attenuation effect of porous elastic materials. Industrial Health, 2:95–105.
Miwa, T. (1964b). Studies in hand protectors for portable vibrating tools. 2. Simulations of
porous elastic materials and their applications to hand protectors. Industrial Health,
2:106–123.
NIOSH (1989). Occupational Exposure to Hand-Arm Vibration. Cincinnati, OH:
Department of Health and Human Services, Public Health Services, Centers for
Disease Control, National Institute for Occupational Safety and Health, Division of
Standards Development and Technology Transfer, DHHS (NIOSH) Publication No.
89–106.
Rasmussen, G. (1982). Measurement of vibration coupled to the hand-arm system. In
Brammer, A.J., Taylor, W., Eds. Vibration effects on the hand and arm in industry.
New York, NY: John Wiley & Sons, 157–167.
Reynolds, D.D. and Keith, R.H. (1977). Hand-arm vibration. Part I. Analytical model of the
vibration response characteristics of the hand. J. Sound and Vibration, 51(2):237–253.
Reynolds, D.D. and Falkenberg, R.J. (1984). A study of hand vibration on chipping and
grinding operators. Part II: Four-degrees-of-freedom lumped parameter model of the
vibration response of the human hand. J. Sound and Vibration, 95(4):499–514.
Tallying the true costs of on-the-job CTDs. CTD News, 1993.
U.S. Department of Labor, Bureau of Labor Statistics. (1997). Occupational Injuries and
Illnesses in the United States by Industry, 1995. Washington, D.C.: U.S. Government
Printing Office.
Viswanath, V. and Fernandez, J.E. (1992). MAF for males performing drilling tasks.
Proceedings of the 36th Annual Human Factors Society Meeting, pp. 692–696.
HAND TOOLS
CRITERIA FOR SELECTION OF HAND TOOLS IN THE
AIRCRAFT MANUFACTURING INDUSTRY: A REVIEW

Bheem P.Kattel and Jeffrey E.Fernandez

Department of Industrial and Manufacturing Engineering


National Institute for Aviation Research
Wichita State University
Wichita, KS 67260–0035, USA

There is no evidence of extensive research on the selection criteria of hand
tools for specific tasks. Tools ergonomically suited for one task, individual,
or environment may not be suitable for another environment or individual.
Success in reducing this mismatch would depend on reliable selection
criteria. The heavy expenses incurred by aircraft manufacturing industries in
purchasing hand tools (especially rivet guns) need to be justified from the
point of view of economics as well as the quality of life of the employees.
This study reviews the literature related to the selection criteria of hand tools
for different types of work. Some factors recommended for consideration
when selecting hand tools for the aircraft manufacturing industry include size,
weight, shape, forces exerted, and vibration.

Key Words: Rivet guns, Vibration, Hand-arm, Musculoskeletal Disorders.

Introduction
Hand tools find their use in all types of environment: in kitchens, in garages, or in industry. The type, size, and shape of a hand tool depend on the nature of the task to be performed. Tools may be entirely manually operated, or manually operated but power driven. Improper use of hand tools has been found to cause a variety of problems, the main effects of which are felt in the upper extremity.
A recent trend in the manufacturing sector has been the production of ergonomically designed products. The market is flooded with products claimed by their manufacturers to be ergonomically sound, and hand tools are no exception to this trend. There is no recorded evidence of these hand tools being evaluated from all aspects to justify their cost. Tools which are ergonomically suited for one task, individual, or environment may not be suitable for other tasks, individuals, or environments.

Modern industrial development requires the production of hand tools to be a significant activity. However, poor design combined with excessive use makes hand tools a potential causative factor in the development of CTDs of the hand, wrist, and arm (Armstrong, 1983; Aghazadeh and Mital, 1987). Usually the tools regarded as best for the job are those that minimize physical stress by placing low force demands on the hand, that are not awkward to hold and handle, and that minimize shock, recoil, and vibration (Radwin and Smith, 1995).

Factors affecting hand tool selection

Posture
Kim and Fernandez (1993) concluded from their study on psychophysical frequency for a drilling task that the task frequency for a drilling operation should be lowered as force and wrist flexion angle are increased. Halpern and Fernandez (1996), Fernandez et al. (1991), and Klein and Fernandez (1997), while exploring the effect of arm and wrist posture on pinch strength, concluded that elbow posture had a significant effect on chuck pinch strength while shoulder posture did not; posture affected endurance time as well. Similarly, Kim et al. (1992) and Fernandez et al. (1993) concluded that wrist flexion had a significant effect on grip strength.

Force
The force requirements for a job are often related to the weight of the tools being handled.
Force demands that exceed an operator’s strength capabilities may cause loss of control,
leading to an unintentional injury and poor work quality (Radwin and Smith, 1995). The
force requirement can be classified into grip force and applied force. Dahalan and Fernandez
(1993) from a study on the psychophysical frequency for a gripping task concluded that
maximum acceptable frequency for a gripping task significantly decreased as the required
gripping force and duration of grip increased.

Repetitiveness
The risk of developing a hand or wrist disorder is significantly increased for workers
performing highly repetitive and forceful exertions (Silverstein et al, 1987). Silverstein et al,
(1986) and Rodgers (1986) have reported that cycle time less than 30 seconds could be
considered as highly repetitive.

Contact stress
Some areas of the body are better suited for bearing contact stress than others. It is related to
the force and area of contact, described by the pressure exerted against the skin (Radwin and
Smith, 1995). Fransson-Hall and Kilbom (1993) from a study on the sensitivity of the hand to
surface pressure concluded that the most sensitive areas were the thenar area, the skinfold
between thumb and index finger and the area around os pisiforme.

Vibration
Burdorf and Monster (1991) investigated riveters and controls in an aircraft company for the
effect of vibration exposure and health complaints. The results of the cross-sectional study
provided some evidence that the use of impact power tools could result in neurovascular
symptoms and damage of bones and joints in the hand-arm system.

Tool weight and load distribution


Generally, the tool weight should be less than 2.3kg; however, for precision operations it should be less than 0.4kg (1.00lb) (Rodgers, 1986). According to Chaffin and Andersson (1993), considerably greater combined weights of tool, hose, and power cord are not uncommon for commercial drills, sanders, buffers, etc. The effect of weight is further aggravated by the additional muscle actions necessary to precisely position and stabilize a tool during operation.

Triggers
Shut-off mechanisms used in power-driven hand tools can affect the stress generated in the hand-arm system while using the tool. Kihlberg et al. (1993) showed significant differences in the arm movements and ground reaction forces between three tools: one with instantaneous shut-off, one with a more slowly declining torque curve, and one with maximum torque maintained for some time before shut-off. The smallest values were found with the fast shut-off tool, while the delayed shut-off tool caused the largest values.

Feed and reaction force


A tool with a rapid coupling obtains better results operating on hard joints than tools with slower disengaging functions (Lindqvist, 1993). Freivalds and Eklund (1993) concluded from a study that using electrical tools at lower rpm levels and under-powering pneumatic tools would result in larger impulses and more stressful ratings.

Handles
Khalil (1973) indicated that there was a decrease in the muscle effort for a given torque as the diameter of the handle varied from 1.25 inches. According to Shih and Wang (1996), among different handle shapes the triangular shape was the most favorable, followed by square, hexagonal and circular. Cochran and Riley (1986) concluded that for tasks involving thrust, push or pull type activities the best handle shape would be triangular, followed by rectangular, and that handles with square and circular cross-sections were the worst. According to Johnson (1988), the best grip diameter, requiring the least effort, was 5cm. According to Fernandez et al. (1991), an increase in handle diameter could cause an increase in maximum wrist flexion but at a lower grip strength.

Grip span and type


Fransson and Winkel (1991) concluded that a span of 50–60mm for females and 55–65mm for males would produce the maximum resultant force between the jaws of the tool for both conventional and ‘reversed’ grip types. Kilbom et al. (1993) showed that, during one-minute repetitive handgrip exercise with simultaneous demands on force and moderate demands on precision, subjects could use 40–50% of their maximal grip force.

Friction
Higher friction between hand and handle will facilitate a good grip and hence less grip force exertion. Bobjer et al. (1993) concluded from a study that the coefficient of palmar friction was dependent on the size of the surface areas in contact with the skin and that there was a low correlation between the coefficient of friction and perceived discomfort.

Recommendation for the aircraft industry


Based on the literature review presented above, Table 1 gives the summary of criteria
recommended for selection of hand tools in the aircraft manufacturing industry.

Table 1. Factors and mode of measurement to be considered for evaluation of rivet guns.

Concluding Remarks
Proper selection of hand tools can reduce the risk of developing CTDs of the upper
extremities. The available literature mostly deals with studies performed to establish safe limits
for various physical characteristics of hand tools. Further research is required to establish
criteria for selecting hand tools for specific operations.

References
Aghazadeh F. and Mital A. 1987, Injuries due to hand tools. Applied Ergonomics, 18(4), 273–278
Armstrong T.J. 1983, An ergonomic guide to carpal tunnel syndrome. American Industrial
Hygiene Association Journal, 43(2), 103–116
Bobjer, O., Johansson, S-E. and Piguet, S. 1993, Friction between hand and handle: Effects
of oil and lard on textured and non-textured surfaces; perception of discomfort.
Applied Ergonomics, 24(3), 190–202
Burdorf A. and Monster A. 1991, Exposure to vibration and self-reported health complaints
of riveters in the aircraft industry. Annals of Occupational Hygiene, 35(3), 287–298
Cochran D.J. and Riley M.W. 1986, The effects of handle shape and size on exerted forces.
Human Factors, 28(3), 253–265
Dahalan J.B. and Fernandez J.E. 1993, Psychophysical frequency for a gripping task,
International Journal of Industrial Ergonomics, 12, 219–230
Fernandez, J.E, Dahalan, J.B., Klein, M.G. and Kim, C.H. 1991, Effect of handle diameter on
maximum wrist flexion and extension. In W.Karwowski and J.W.Yates (eds.)
Advances in Industrial Ergonomics and Safety III, 1991, (Taylor & Francis, London),
351–157
Fransson C. and Winkel J. 1991, Hand strength: the influence of grip span and grip type,
Ergonomics, 34(7), 881–892
Fransson-Hall C. and Kilbom A. 1993, Sensitivity of the hand to surface pressure, Applied
Ergonomics, 24(3), 181–189
Freivalds A. and Eklund J. 1993, Reaction torques and operator stress while using powered
nutrunners, Applied Ergonomics, 24(3), 158–164
Halpern C.A. and Fernandez J.E. 1996, The effect of arm posture on peak pinch strength,
Journal of Human Ergology, 25(1), 141–148
Halpern C.A. and Fernandez J.E. 1993, The effect of wrist posture and pinch type on
endurance time. In W.Marras, W.Karwowski, J.Smith, and L.Pacholski (eds.) The
Ergonomics of Manual Work, (Taylor & Francis, London), 323–326
Johnson S.L. 1988, Evaluation of powered screwdriver design characteristics, Human
Factors, 30(1), 61–69
Khalil T. 1973, An Electromyographic methodology for the evaluation of industrial design,
Human Factors, 15(3), 257–264
Kihlberg, S., Kjellberg, A. and Lindbeck, L. 1993, Pneumatic tool torque reaction: reaction
forces, displacement, muscle activity and discomfort in the hand-arm system, Applied
Ergonomics, 24(3), 165–173
Kilbom, A., Makarainen, M., Sperling , L., Kadefors, R. and Liedberg, L. 1993, Tool design,
user characteristics and performance: a case study on plate-shears, Applied
Ergonomics, 24(3), 221–230
Kim C.H. and Fernandez J.E. 1993, Psychophysical frequency for a drilling task,
International Journal of Industrial Ergonomics, 12, 209–218
Kim, C.H., Marley, R.J. and Fernandez, J.E. 1992, Prediction models of grip strength at
varying wrist positions. In S.Kumar (ed.) Advances in Industrial Ergonomics and
Safety IV 1992, (Taylor & Francis, London), 783–788
Klein M.G. and Fernandez J.E. 1997, The effects of posture, duration, and force on pinching
frequency, International Journal of Industrial Ergonomics, 20, 267–275
Lindqvist, B. 1993, Torque reaction in angled nutrunners, Applied Ergonomics, 24(3),
174–180
NIOSH, 1989, Criteria for a recommended standard: Occupational exposure to hand-arm
vibration, DHHS (NIOSH) Publication No. 89–106, 86–92
Radwin R.G. and Smith S.S. 1995, Industrial power hand tool ergonomics research: current
research, practice, and needs, DHHS (NIOSH) Publication No. 95–114.
Rodgers, S.H. 1986, Ergonomic Design for People at Work, Vol.2, (Van Nostrand Reinhold,
New York)
Shih Y.C. and Wang M.J.J. 1996, Hand/tool interface effects on human torque capacity,
International Journal of Industrial Ergonomics, 18, 205–213
Silverstein, B.A., Fine, L.J. and Armstrong, T.J. 1987, Occupational factors and carpal tunnel
syndrome, American Journal of Industrial Medicine, 11, 267–274
EXPOSURE ASSESSMENT OF ICE CREAM SCOOPING
TASKS

Patrick G. Dempsey, Raymond McGorry, John Cotnam and Ilya Bezverkhny

Liberty Mutual Research Center for Safety and Health
71 Frankland Road
Hopkinton, MA 01748 U.S.A.

One area of upper-extremity cumulative trauma disorder research and
practice that has been relatively ignored is the quantification of forces
required by repetitive tasks. While wrist posture and frequency have been
measured with some precision, forces are often evaluated qualitatively. This
study extended a previously developed hand tool kit to specifically measure
grip and scoop head forces on an aluminum ice cream scoop. The
instrumented scoop was used to investigate forces required to scoop nine
different flavors of ice cream between approximately –19° C and –12° C.
Eight subjects participated in the experiment. Of particular interest was the
percentage of maximum voluntary grip contraction forces required. The
results indicate that ice cream scooping demands force exertions that exceed
recommendations for repetitive exertions.

Introduction
Repetitive work with hand tools is thought to be a risk factor for work-related upper-
extremity cumulative trauma disorders (WRUECTDs). The primary task-related risk factors
for WRUECTDs are force, posture and repetition. Repetition can be readily measured
through observation, and wrist posture can be directly measured with reasonable precision by
electro-mechanical goniometers. However, direct measurement of force has been somewhat
ignored, although there have been some efforts to directly measure forces at the hand-to-tool
coupling. In general, assessments of muscular force associated with repetitive tasks involving
the upper extremity have been fairly qualitative. Electromyography (EMG) presents one
alternative, but the difficulties associated with the lack of control afforded by most
workplaces makes EMG an infeasible alternative. Another approach is using force sensitive
resistors at the point(s) of force application. The resistors can be intrusive and must be placed
rather precisely on contact points.
Previously-developed hand tool technology was extended and utilized to directly measure
forces required to scoop nine flavors of ice cream across a range of temperatures. The
primary focus was assessing grip force, due to the involvement of the flexor tendons in
WRUECTDs.

Methods

Subjects
Four male and four female subjects voluntarily participated in the experiment. The mean ages
(standard deviation) of the male and female subjects were 34.0 (11.3) and 33.8 (3.3) years,
respectively.

Apparatus
An instrumented ice cream scoop was used to collect data on grip force exerted on the handle
and data on moments acting about the head of the ice cream scoop. Grip force was measured
with two button load cells mounted orthogonally within the handle of the ice cream scoop.
Moments about the center of the handle were measured in two planes using strain gauges
mounted on the interface between the handle and the scoop head. Linear regressions were
used to relate voltages to forces (N) and moments (N•m). The r2 values for the four channels
ranged between 0.997 and 0.999. The signals were passed through a differential amplifier to
gain amplifiers. The gained output was passed to a low-pass filter with a 16Hz cutoff
frequency. The filtered signal was sent to an analog-to-digital converter, a 512k
microprocessor, and over an RS232 line to the serial port of a personal computer. The “dull”
scoop head, the electronic scoop, and an original scoop that the electronic scoop was modeled
on are shown in Figure 1.
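
The calibration and filtering steps described above can be illustrated with a short sketch. This is
not the authors' acquisition software; the sampling rate, loads and voltages below are hypothetical.

import numpy as np
from scipy.signal import butter, filtfilt

# Hypothetical calibration data: known loads (N) applied to a handle load cell
# and the corresponding amplified output voltages (V).
loads_n = np.array([0.0, 20.0, 40.0, 60.0, 80.0])
volts   = np.array([0.02, 0.51, 1.00, 1.52, 2.01])

# Linear regression relating voltage to force, as described for the four channels.
slope, intercept = np.polyfit(volts, loads_n, 1)
r2 = np.corrcoef(volts, loads_n)[0, 1] ** 2     # should approach 1 for a good channel

def volts_to_force(v):
    return slope * v + intercept

# Low-pass filtering of a sampled force signal with a 16 Hz cutoff,
# assuming (hypothetically) a 100 Hz sampling rate.
fs_hz = 100.0
b, a = butter(4, 16.0, btype='low', fs=fs_hz)

raw_volts = 1.0 + 0.1 * np.random.randn(500)           # stand-in signal
force_n = filtfilt(b, a, volts_to_force(raw_volts))    # calibrated, filtered force (N)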

Figure 1. Dull scoop head, electronic scoop, and original scoop.

Figure 2. Experimental setup.

A mock-up dip-chest was constructed from wood to simulate a two-lid Kelvinator dip chest
used in ice cream shops. Figure 2 shows a subject scooping ice cream from the mock-up
dip chest. A digital probe thermometer was used to measure ice cream temperatures immediately
after scooping. Six flavors of ice cream, two flavors of frozen yogurt, and one flavor of sorbet
were used.

Procedures
Subjects were provided limited training. The training covered maintaining a straight wrist
while scooping, scooping straight across the ice cream, and scooping from the highest point
in the box of ice cream to the lowest.

Maximum voluntary contraction (MVC) data for grip force were collected with the
electronic scoop handle at least one day prior to the experimental tasks. The Caldwell
regimen, with modifications used by Berg et al (1988) and Dempsey and Ayoub (1996) for
collecting pinch strength data, was used. The average of two trials within ±15% of each other
was used as the datum.
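
As a small illustration of this trial-selection rule (a sketch of our reading of the criterion, with
the ±15% difference taken relative to the pair mean, not necessarily the authors' exact procedure):

def mvc_datum(trials):
    """Return the mean of the first pair of consecutive grip MVC trials (N)
    that agree within +/-15%, taking the difference relative to the pair mean."""
    for a, b in zip(trials, trials[1:]):
        if abs(a - b) <= 0.15 * (a + b) / 2.0:
            return (a + b) / 2.0
    return None   # criterion not met; further trials would be required

print(mvc_datum([310.0, 262.0, 274.0]))   # hypothetical values in newtons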
A range of temperatures was studied. The goal was to test the ice cream at –18°C, –15°C,
and –12°C. Due to difficulties controlling temperature with precision, the testing was
performed over a range of –19.3°C to –12.4°C.
Subjects performed the experimental tasks on three different days. On each day, subjects
performed two replications of scooping for each flavor, for a total of 18 scoops per session.
The order of presentation of the flavors during a session was random. On the three different
days, the ice cream was at different temperatures for each session. Due to differences in
contents, location in the freezer, etc., all flavors tested were not at the same temperature;
however, during any given session all flavors were at similar temperatures relative to the
range of temperatures tested.
Subjects performed the experiment in pairs. One flavor at a time was selected at random
and placed in the mock-up. Each subject scooped two scoops from the box placed in the
mock-up dip chest. Before each scoop was performed, the subject dipped the scoop in a
room-temperature water bath and shook off any excess water. Only one subject was in the
laboratory at a time; the second subject waited outside. The ice cream scoops were placed
into plastic bowls for weighing. One experimenter controlled data acquisition and a second
experimenter weighed the ice cream and transferred the boxes from the freezer to the mock-
up and vice-versa. Once the four scoops were complete, the thermometer was inserted into
the box of ice cream approximately 2.5cm below the surface. The temperature was recorded
once the reading on the thermometer stabilized. The process was repeated for the other
flavors.

Results
Grip force readings for each trial were averaged and divided by each subject’s maximum grip
force (MGF), yielding average %MGF. An unexpected finding was a tendency for subjects
not to exert a higher %MGF at colder temperatures. Ordinary least-squares regression was
used to examine the relationship between temperature and %MGF. A significant relationship
between %MGF and temperature was found for only one flavor. In this case, %MGF
decreased approximately 4% for each 0.6° C increase in temperature. Given the general lack
of significance, the %MGF was calculated across all temperatures for each flavor. The
averages for each flavor ranged from 40.2% to 50.6%. Average values for subjects ranged
between 36.0% and 68.8%.
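
The %MGF calculation and the ordinary least-squares check against temperature can be sketched
as follows; the trial values are hypothetical and stand in for one subject and one flavor.

import numpy as np

def average_pct_mgf(trial_force_n, max_grip_force_n):
    """Average grip force over a trial expressed as a percentage of the
    subject's maximum grip force (MGF)."""
    return 100.0 * np.mean(trial_force_n) / max_grip_force_n

# Hypothetical per-trial summaries: average %MGF against ice cream temperature (degC).
pct_mgf = np.array([52.0, 48.5, 45.0, 43.2, 41.0])
temp_c  = np.array([-19.0, -17.5, -16.0, -14.0, -12.5])

slope, intercept = np.polyfit(temp_c, pct_mgf, 1)   # ordinary least squares, degree 1
print(f"change in %MGF per 0.6 degC increase: {0.6 * slope:.1f}")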
The total amount of grip force exerted to scoop a given weight of ice cream was analyzed.
The grip forces for each trial were integrated and divided by the weight of ice cream scooped.
This normalization was necessary because the exact amount of ice cream
scooped could not be controlled. Figure 3 shows the relationships between this quantity and
temperature for each flavor (functions are from regressions). For all flavors, temperature
significantly affected integrated grip forces (using α=0.05). With the exception of one flavor,
the second order quadratic terms for temperature were significant. Since the significance
level for the second order term (0.0529) was just slightly over 0.05 for that flavor, the
quadratic function is reported in Figure 3.
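
A minimal sketch of the normalization described above (trapezoidal integration of the force
trace divided by the weight scooped); the sampling interval and values here are hypothetical.

import numpy as np

def integrated_force_per_gram(force_n, time_s, scooped_mass_g):
    """Time integral of grip force (N.s), by the trapezoidal rule, divided by
    the mass of ice cream scooped (g), since scoop size was not controlled."""
    force_n = np.asarray(force_n, dtype=float)
    time_s = np.asarray(time_s, dtype=float)
    integral = np.sum(0.5 * (force_n[1:] + force_n[:-1]) * np.diff(time_s))
    return integral / scooped_mass_g

t = np.arange(0, 3.0, 0.01)                  # a hypothetical 3 s scoop sampled at 100 Hz
f = 150.0 * np.sin(np.pi * t / 3.0) ** 2     # stand-in force trace (N)
print(integrated_force_per_gram(f, t, scooped_mass_g=85.0))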
The moments the head forces create about the center of the handle were analyzed in the
same manner as the grip forces, i.e. the moments were integrated and divided by the weight
scooped for a particular trial. For all flavors, temperature significantly affected the integrated
scoop moments. Four of the regression functions were linear, and five were second order
quadratics (see Figure 4).

An important aspect of scooping is the
amount of time required per scoop of ice
cream. Force and duration of an exertion can
be used to estimate the “acceptability” of a
given exposure pattern, so the amount of
time scooping per gram of ice cream was
analyzed. Figure 5 shows the relationships
between temperature and the amount of time
required per gram of ice cream for the nine
flavors tested. The relationships were second
order for all flavors except one.

Figure 3. Relationship between temperature and integrated grip forces for 9 flavors tested.

Figure 4. Relationship between temperature and integrated scoop moments for 9 flavors tested.

Figure 5. Relationship between temperature and time per gram for 9 flavors tested.

Discussion
The experiment reported here provides quantification of the task demands associated with
scooping ice cream. The authors were unable to find any ergonomic literature related to
scooping ice cream. Likewise, the study presents quantitative estimates of exerted grip force
while using hand tools. Individual subjects averaged up to almost 70% of their maximum grip
force, which exceeds guidelines for even occasional exertions. For example, Putz-Anderson
(1988) recommends limiting the magnitude of repetitive exertions to less than 30% of
maximum, and limiting all exertions to 50% of maximum. It should be noted that values over
90% were observed for individual trials. These percentages are conservative in that MGF was
collected with a neutral posture. Wrist deviation (ulnar/radial or flexion/extension) decreases
the prehensile strength capabilities of the hand (e.g., Dempsey and Ayoub, 1996;
Putz-Anderson, 1988). Thus, the actual %MGF values may have been somewhat higher for
trials where wrist deviation was involved.
Figures 3 and 4 indicate that integrated scoop moments and grip force follow the same
pattern as the time/gram relationships shown in Figure 5. A somewhat unexpected finding was
that subjects do not necessarily increase forces when scooping harder (colder) ice cream;
rather, the time spent scooping is increased. Most relationships between peak forces and
moments versus temperature were not statistically significant.
Hopefully, the expansion of such direct measurement technologies will lead to the
quantification of task demands associated
with repetitive hand tool use. Direct measurement of force may permit quantitative
relationships between force and WRUECTD outcomes to be established. Such knowledge
would be very beneficial to designers and practitioners. Currently, epidemiological evidence
linking wrist posture, force, and repetition to the incidence and severity of WRUECTDs only
allows for gross judgments concerning risk associated with various work designs. Predictive
models would allow for more quantitative comparisons of different work designs. Likewise,
direct measurement provides the opportunity to gain considerable insight into the human
responses to work with hand tools, some of which are not intuitive.
The relationships presented in Figures 3–5 permit work practices to be developed with
more precision than was afforded before the experiment. In particular, the graphs show that forces,
moments, and time per gram increase markedly at temperatures below -14°C. This point may
be used as a goal for the coldest temperature that ice cream is scooped at. Often, ice cream is
stored at a temperature colder than it is scooped at, indicating intermediate storage freezers
should have capacity to allow sufficient time for the ice cream to reach serving temperature.
The %MGF and time values for particular temperatures might also be used to suggest work/
rest regimes. Of course, the actual time content of the particular job being examined would be
necessary in addition to the results presented here.
Several limitations of this study should be mentioned. The experiment was performed
with nine flavors of a particular brand of ice cream assumed to provide a representative
sample with regards to additives (e.g., nuts, fruit) and consistencies. It is possible that the
results (in absolute terms) are specific to the flavors tested. However, it does not seem
unreasonable to extrapolate the results, at least qualitatively, to other brands. Additionally, the
subjects were not highly trained; however, comparison of the results presented here to data
collected in the field indicates that results under actual working conditions are quite similar.

Acknowledgments
The authors would like to thank Richard Holihan, Walter Majkut and Peter Teare for their
assistance with the design, construction, and calibration of the electronic scoop, and
constructing the mock-up dip chest. The authors would also like to thank the eight volunteers
that participated in the experiment.

References
Berg, V.J., Clay, D.J., Fathallah, F.A., and Higginbotham, V.L., 1988, The effects of
instruction on finger strength measurements: Applicability of the Caldwell regimen. In
F.Aghazadeh (ed.) Trends in Ergonomics/Human Factors V, (Elsevier, Amsterdam),
191–198
Dempsey, P.G., and Ayoub, M.M. 1996, The influence of gender, grasp type, pinch width and
wrist position on sustained pinch strength, International Journal of Industrial
Ergonomics, 17(3), 259–273
Putz-Anderson, V. (ed.) 1988, Cumulative trauma disorders: A manual for musculoskeletal
diseases of the upper limbs, (Taylor and Francis, London)
THERMAL ENVIRONMENTS
THE EFFECT OF CLOTHING FIT ON THE
CLOTHING VENTILATION INDEX

Lisa Bouskill1, Nicola Sheldon1, Ken Parsons1 and W R Withey2

(1) Loughborough University
Loughborough LE11 3TU

(2) Centre for Human Sciences
Defence Evaluation Research Agency
Farnborough GU14 0LX

To quantify the effect of clothing ‘fit’ on the exchange of air between
clothing and the environment, 9 male subjects undertook 2 exposures to an
environment of ta=5.0 (1SD=0.3) °C, tr=5.0 (0.3) °C, Va=0.12 (0.02) ms-1 and
rh=62 (1)%. During one exposure the subjects wore a ‘large’-size, 2-piece
air-impermeable suit and during the other exposure they wore a ‘small’-size
suit of the same design. During both exposures, air exchange—Ventilation
Index—(VI) was measured whilst subjects performed 3 activities: standing
with no movement, stepping on and off a platform and rotating each limb in
turn at a constant rate. VI was 28% higher (P<0.01) in the large suit
compared to the small suit when standing with no movement, and 15%
higher (P<0.01) for all activities combined. It is concluded that the rate of
heat exchange between the skin and the environment depends on clothing fit.

Introduction
It is not always the case that correctly fitting clothing is available for an individual when
required. An inappropriate fit can be detrimental to the wearer’s performance due to a number
of factors. Clothing which is too tight may restrict movement and may rub against areas of
the body causing chaffing. Similarly, clothing which is too loose may present a snag or trip
hazard and may be generally uncomfortable to wear. The incorrect fit of clothing may also
generate problems associated with incompatibility between garments in an ensemble or
between garments and equipment used in association with the clothing.

In terms of clothing design and thermal physiology, the fit of clothing influences both the
micro-environment volume within the ensemble and the ability to generate ‘pumping’ within
it. In turn, these properties influence the ventilation characteristics of an ensemble and thus
the effect the ensemble has on the exchange of sensible and evaporative heat between the skin
and the environment. It has therefore been suggested that “the measurement and control of
the pumping effect seems to be the key for further advances in the field of thermal insulation
of clothing ensembles” (Vogt et al, 1983).

Our study was conducted as part of a larger project to quantify the effects of different
aspects of clothing garments and ensembles on the exchange of heat between the skin and the
environment. Of particular interest was the relationship between values obtained from
standardised measurements in laboratory conditions, such as intrinsic insulation (Icl) and
evaporative resistance (Iecl), and the ‘resultant’ values of these parameters (Vogt et al, 1983)
obtained when the clothing is worn in workplace situations.

The study used the tracer gas technique to determine the air exchange rate between the
clothing and the environment; when combined with the measurement of micro-environment
volume it provided the Ventilation Index (VI) (Birnbaum and Crockford, 1978), where:
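(The equation itself was not reproduced in this copy; consistent with the description below of
VI as the product of the clothing air exchange rate and the micro-environment volume, a
plausible form, in our own notation rather than necessarily that of Birnbaum and Crockford, is

$$\mathrm{VI}\;(\mathrm{l\,min^{-1}}) \;=\; V_{\mathrm{me}}\;(\mathrm{l}) \;\times\; m\;(\mathrm{air\ changes\ per\ minute})$$

where $V_{\mathrm{me}}$ is the micro-environment volume and $m$ is the air exchange rate.)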

The aim of our study was to examine the effect of fit of clothing on VI. The null hypothesis
(H0) was that VI would not be related to clothing fit. The alternative hypothesis (H1) was that
loose-fitting clothing would have a higher VI than tight-fitting clothing. If a relationship can
be quantified then it will be possible to relate clothing fit, through VI, to its effects on
clothing insulation and on the heat exchange between the wearer and the environment, thus
making evaluations of thermal risk in the workplace more accurate.

Materials and Methods


Subjects
Nine, healthy, physically-active males volunteered to participate in the study. Their physical
characteristics are given in Table 1. They were fully informed of the objectives, procedures
and possible hazards of the study and completed a Form of Consent before exposure. During
the study, left aural temperature and heart rate were recorded as safety measures.

Table 1. Subject physical characteristics. Mean (1SD)

Clothing and Clothing Fit


During the study subjects wore the following clothing: underpants (own), short socks, soft
‘trainers’ and a 2-piece oversuit comprising trousers with an elasticated waist and a zip-
fronted, long-sleeved jacket with elasticated bottom hem. The oversuit was made from
GoreTex™ material with elasticated cotton wrist cuffs and ‘popper’ fastenings at the ankles.
The jacket zip was protected with a ‘popper’-fastened wind baffle. Subjects wore a ‘large’-
size suit and a ‘small’-size suit of the same design during separate exposures. The order of
wearing the suits was randomised.

For this study it was necessary to make an estimate of the way in which the suit fitted the
subjects. Various ways to quantify this fit were considered. For reasons of speed,
practicability and simplicity the following technique was adopted. Seven anatomical
landmarks were used as measuring sites: forearm, upper arm, chest, abdomen, hip, thigh and
lower leg. At each site the ‘excess’ fabric of the suit was ‘pinched’ away from the body, and
the extent of this ‘excess’ measured. The mean of these values was calculated and used as an
indication of the fit of the suit (Table 2).

Determination of the Ventilation Index


VI is the product of the clothing air exchange rate and micro-environment volume. These
were measured in separate test sessions as follows (Bouskill et al, 1997):

Air Exchange Rate


The air exchange rate of the ensemble was obtained using a tracer gas technique. Nitrogen
was flushed through the clothing at a constant rate using a system of distribution tubes. The
gas in the micro-environment was sampled using a system of sampling tubes connected to a
small vacuum pump and an oxygen analyser. The time taken for the oxygen concentration in
the micro-environment to return to 19% from 10% was used to calculate the rate of air
exchange between the micro-environment and the external environment.
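
A minimal sketch of the dilution calculation implied above, assuming a well-mixed
micro-environment, an ambient oxygen concentration of about 20.9% and first-order recovery
(the constants and function names are ours, not the authors'):

import math

def air_exchange_rate_per_min(recovery_time_min, c_start=10.0, c_end=19.0, c_ambient=20.9):
    """Air changes per minute from the time taken for micro-environment oxygen
    to recover from c_start to c_end (%), assuming well-mixed first-order dilution."""
    return math.log((c_ambient - c_start) / (c_ambient - c_end)) / recovery_time_min

def ventilation_index(exchange_rate_per_min, micro_env_volume_l):
    """Ventilation Index (l/min): air exchange rate times micro-environment volume."""
    return exchange_rate_per_min * micro_env_volume_l

# e.g. a 2.5 min recovery with a 6 l micro-environment (hypothetical values)
m = air_exchange_rate_per_min(2.5)
print(ventilation_index(m, 6.0))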

Micro-environment Volume
The volume of air trapped within each ensemble (ie large-suit and small-suit) was determined
in triplicate in a separate test session. The subject wore over the ensemble a 1-piece air-
impermeable suit, sealed at the neck, which enclosed the entire body including the hands and
feet. Air was evacuated from this suit until pressure, as measured using a U-tube manometer
attached to a perforated tube fed down the ensemble trouser leg, began to change. This was
taken to be the point at which the air in the oversuit had been evacuated such that it just lay on
top of the ensemble. Evacuation continued until no more air could be removed from the suit.
This additional volume of evacuated air was taken to represent the micro-environment volume.

Test Protocol
The test protocol consisted of two separate exposures (one for each size of suit), in a
controlled-environment chamber, to the following thermal conditions: ta=5.0 (1SD=0.3) °C,
tr=5.0 (1SD=0.3) °C, Va=0.12 (1SD=0.02) ms-1 and rh=62 (1SD=1)% (water vapour
pressure=0.54 (1SD=0.01) kPa). During each exposure subjects performed 3 activities:

a. Standing stationary.
b. A ‘step-up’ routine, each step movement taking 1.2 seconds as cued by a metronome.
c. A rotating limbs routine, each limb being moved individually in large arcs, with each arc
taking 4.8 seconds to complete as cued by a metronome.

VI was determined 3 times for each activity, from which an average value was calculated. The
3 measurements were made consecutively, but the order of presentation of activities was
balanced between subjects.

Results
Clothing fit and VI values are given in Table 2. VI was 28%, 15%, 8% and 15% higher,
respectively, for the standing stationary, stepping and rotating limbs activities and for all activities
combined when the large suit was worn rather than the small suit. In
practical terms this translates to increases in ventilation of 1.9 l min-1, 1.7 l min-1, 1.0 l min-1 and
1.5 l min-1 respectively for the 3 activities and all activities combined.

Preliminary statistical analysis (Student’s t-tests) shows significantly higher values for the
standing stationary activity in the large suit compared with the small suit (P<0.01). However,
for the stepping and rotating limbs activities the differences between suits were not
statistically significant. The mean VI for all activities combined was greater in the large suit
than in the small suit (P<0.05). The relationship between clothing fit and the mean VI for all
activities combined is shown for each subject in Figure 1.

Table 2. Clothing fit and Ventilation Index values when standing with no movement,
stepping and rotating limbs activities with ‘small’ and ‘large’ suits

Figure 1. Relationship between fit of clothing and Ventilation Index when wearing the
large and small suits.

Discussion
The primary aim of this study was to examine the effect of the fit of clothing on the clothing
ventilation index. As expected, each individual had their own unique value of ‘fit’ of the GoreTex
suit, and different values of VI (Figure 1). The overall trend of the data is that loose-fitting
clothing allowed greater ‘pumping’ ie ventilation than did the tight-fitting clothing. The mean
increase when standing stationary was 28% for the large suit compared with the small suit. The
reason for this unexpectedly large difference is not clear, but is presumably related to the large
suit allowing air to exchange through the openings (the neck, cuffs and leg bottoms) by means
of natural convection. This emphasises the importance of this avenue of sensible and evaporative
heat exchange with the environment for maintaining heat balance in the workplace.

GoreTex is relatively impermeable to air, so in this study, in which the ambient air speed was
low (0.12 ms-1), most of the air exchange probably took place through the openings alone.
Clothing with a higher air permeability would be expected to allow air exchange by diffusion
through the fabric of the garments. In these circumstances the relative contribution of the
natural convection component would be smaller. The practical consequences of this must be
taken into account when selecting clothing for work in environments in which air speed could
significantly lower resultant insulation, or in selecting a limiting work environment when the
clothing ensemble for that particular workplace is invariable.

During this study it was also observed that when the clothing was very tight the ability of the
subject to perform some tasks was impaired. Subject 2 was restricted, in both the stepping
and rotating limbs activities, because of the tightness of both the small and large suits around
his waist and thighs. This explains why Subject 2 did not achieve a higher VI in the large suit.

This study has shown that it is in the interest of individuals and their employers to ensure that
clothing is of a suitable fit, because ill-fitting clothing restricts movement and also because
pumping may have detrimental physiological effects if the air temperature of the environment
is cold. However, in warm and hot air temperatures pumping may assist heat loss by
convection and evaporation, and thus be of benefit to the wearer.

References
Birnbaum, R.R. and Crockford, G.W. 1978, Measurement of clothing ventilation index,
Applied Ergonomics, 9, 194–200
Bouskill, L.M., Withey, W.R., Watson-Hopkinson, I. and Parsons, K.C. 1997, The
relationship between the clothing ventilation index and its physiological effects. In
R.Nielsen and C.Borg (eds.) Proceedings of the Fifth Scandinavian Symposium on
Protective Clothing, Elsinore, Denmark, 5–8 May 1997, 36–40
Vogt, J.J., Meyer J.P., Candas, V., Libert J.P. and Sagot, J.C. 1983, Pumping effect on
thermal insulation of clothing worn by human subjects. Ergonomics, 26, 963–974

The support of the Ministry of Defence, DERA Centre for Human Sciences, is acknowledged.

© British Crown Copyright 1998/DERA. Published with the permission of the Controller of
Her Britannic Majesty’s Stationery Office
A THERMOREGULATORY MODEL FOR
PREDICTING TRANSIENT THERMAL SENSATION

Fangyu Zhu and Nick Baker

The Martin Centre for Architectural and Urban Studies


Department of Architecture, University of Cambridge
Cambridge, CB2 2EB

Comprehensive experiments to investigate human reactions to both transient
and spatially inhomogeneous thermal environments have been reported in
the literature, but the relationship between thermal sensation and
environmental parameters has yet to be made explicit. In this research
project, a 37-node human thermoregulatory model has been constructed. The
model is able to distinguish the effects of spatial and temporal changes of
conditions around the body. A thermal sensation model will be developed to
translate a given body thermal state into a corresponding thermal sensation
on the ASHRAE seven-point thermal sensation scale. This paper mainly
describes the physiological model.

Introduction
The ISO standard 7730 (1994) is mainly based on Fanger’s research under steady-state
conditions (Fanger, 1970). Fanger produced the PMV and PPD indices; PMV index predicts
the thermal sensation and PPD index predicts thermal discomfort. However dissatisfaction
with air-conditioned buildings designed according to the standard is widespread. The rigid
indoor temperature limits lead to needless heating and cooling which typically cause high
consumption of energy. Furthermore the sick building syndrome (SBS) is common in these
buildings.
A new philosophy of building design is emerging with an interest in the use of structure
and forms to moderate the climate, allowing the temperature to fluctuate within limits during
the day. Recently developed task-conditioning systems function by creating transient and
highly asymmetric environments around the workstation (Heinemeier et al., 1990). By
deliberately departing from the conventional goal of HVAC practice, namely steady, isothermal,
low-speed, uniform air flow within the entire room, task-conditioning designs have
highlighted the shortcomings of Fanger’s numerical models of human thermal balance, which
only resolve the steady state heat and mass fluxes at whole-body level.
This research project includes the development of two models. A 37-node human
thermoregulatory model has been developed and validated against experimental results.
A thermal sensation model will be built to transform the multivariable thermal state into a
single thermal sensation index. The process by which a given thermal environment produces
a corresponding thermal sensation is illustrated in Figure 1.

Figure 1. Thermoregulatory and thermal sensation model

The Thermoregulatory Model

Main features of the model


The Human Body Thermoregulatory Model (HBTM) is based mainly on Stolwijk’s 25-node
model (Stolwijk, 1971). In order to represent a person in a real room, several modifications
have been made to the original model, in which the environment was assumed to be isothermal
and the body naked. Clothing is represented by two additional layers in each segment. Heat
and humidity transfer resistances and the ‘pumping effect’ due to movement are considered.
Convective and radiative heat losses are computed for each segment. The heat transfer
coefficients are calculated from the experimental results of de Dear et al (1997), which were
measured using an articulated thermal manikin. The evaporative heat loss from the skin and
the heat loss by respiration were considered to be constant in the original model; in this model,
they are computed from the equations in the ASHRAE Handbook of Fundamentals (1989). The
model has been written in FORTRAN and runs quickly under UNIX.
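
To illustrate the structure of such a multi-node model (a simplified sketch, not the authors'
FORTRAN implementation; the active control system of vasomotor response, sweating and
shivering is omitted, and all node properties are placeholder values):

import numpy as np

def step_segment(T, C, K, h, A, T_env, q_met, dt):
    """One explicit Euler step for a single body segment modelled as a chain of
    nodes (e.g. core, skin, clothing). Passive system only: conduction between
    adjacent nodes plus combined convective/radiative loss from the outer node.
    T: node temperatures (degC); C: heat capacities (J/K); K: inter-node
    conductances (W/K); h: surface coefficient (W/m2K); A: surface area (m2)."""
    q = np.array(q_met, dtype=float)           # metabolic heat input per node (W)
    for i, k in enumerate(K):                  # conduction between adjacent nodes
        flow = k * (T[i] - T[i + 1])           # positive flow is outwards (W)
        q[i] -= flow
        q[i + 1] += flow
    q[-1] -= h * A * (T[-1] - T_env)           # loss from the outer (clothing) node
    return T + dt * q / np.array(C, dtype=float)

# Illustrative use only: core, skin and clothing nodes of one segment for one hour.
T = np.array([36.8, 33.5, 28.0])
for _ in range(3600):
    T = step_segment(T, C=[30000.0, 4000.0, 500.0], K=[6.0, 4.0],
                     h=8.0, A=0.15, T_env=25.0, q_met=[5.0, 0.5, 0.0], dt=1.0)
print(T)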

Validation of the model


Several researchers have measured body temperatures of subjects, who were dressed in shorts
and in the sitting-resting position (Thellier, 1994). The subjects were exposed for 1 hour at a
neutral temperature of 28°C, then quickly transferred for a 2 hour exposure to a hot or cold
condition. The temperatures were taken after one hour’s exposure to the second environment.
Presented in Figure 2 are the two measured mean skin temperatures and the predictions by
Thellier’s model, along with those from HBTM. Figure 3 shows the experimental values
(Wang, 1992) and predicted results of head, trunk and feet skin temperatures. The
agreement between computed and measured results is reasonable. There is a change in the
slope of the predicted curve between 29°C and 31°C, when thermoregulatory reactions
against cold are changed into reactions against heat.
Wang (1992) conducted experiments of sudden temperature changes, which were
simulated by HBTM. Although several experiments have been used for comparison, only one
is described here. After 30 minutes of staying at an ambient temperature of 25°C, the subjects
stayed for 60 minutes at an ambient temperature of 35°C, and an additional 30 minutes of
recovery in a 25°C environment. The clothing insulation and mass of the six segments are
presented in Table 1.

Figure 2. The predicted and measured mean skin temperatures in quasi-steady state

Figure 3. The predicted and experimental head, trunk and feet skin temperatures in
quasi-steady state

Figure 4. Mean skin temperature, head and trunk core temperatures in transient
conditions

The computed and measured mean skin temperatures, and the calculated head and trunk
core temperatures are shown in Figure 4. There is reasonable agreement between
simulated and observed values, but the model appears to respond somewhat faster than reality. Although
this phenomenon has been identified by many authors, a good explanation has not yet been
found.

Table 1. The clothing insulation and mass of the six segments

Thermal sensation model


Humphreys and Nicol (1996) pointed out that the PMV equation employs different comfort
criteria for estimating the conditions which would yield thermal neutrality and for assessing the
effects of departures from neutrality. It was shown that this difference of criteria undermines
the estimation of PMV wherever the clothing insulation differs from 0.6 clo. They
considered that this weakness stems from the misleading idea of thermal load, which is
the central concept of PMV. We think that a further reason for the weakness might be
the limitation of Fanger’s physical model: it includes only the passive system and does not
include the control system. The human body is capable of maintaining heat balance
within relatively wide limits of the environmental variables. There is a neutral comfort interval
within which the control system can be neglected, so Fanger’s PMV is a good
predictor of the optimal comfort state. But when the environment departs from the neutral
comfort conditions, the effort necessary for thermoregulation increases and the control
system cannot be neglected. In a dynamic state the control system works all the time; the
mean skin temperature and the sweat secretion cannot be kept at the comfort values, so
Fanger’s thermal load does not exist. Because both the control and the passive system are
considered in HBTM, it can overcome this limitation of Fanger’s method.
To complete the development of a thermal sensation model, it is necessary to devise a
relationship between thermal sensation and the thermal state. This relationship is called the
new predicted mean vote (NPMV):

where
Q is the metabolic heat production (W/m2)
Q0 is the neutral metabolic heat production (W/m2)
BF is the blood flow rate (l/h·m2)
BF0 is the neutral blood flow rate (l/h·m2)
Es is the sweat secretion rate (W/m2)
Es0 is the neutral sweat secretion rate (W/m2)
S is the net heat storage (W/m2)
K1, K2, K3 and K4 are appropriate weighting factors.

NPMV is based on the following two assumptions:


(1) a person at steady state in a neutral comfort environment expends minimal effort for
thermal regulation and the net heat storage is zero;
(2) as the environment changes from the above condition the effort necessary for
thermoregulation increases, and the changes in thermal sensation may be correlated with
changes in the effort required for thermoregulation and the net heat storage.
The best estimates of the neutral values and the weighting factors will be found by
correlating the thermoregulatory model results with experimental data. These values depend
on the metabolic rate.

Conclusion
The thermoregulatory model presented in this paper can be used to predict the global and
local physiological parameters of the human body. It considers the local interactions between
the body and the environment. Validation work using independent experimental
measurements showed satisfactory agreement between predicted and measured skin temperatures. Further
refinements can be made as more and better experimental data become available. A thermal
sensation model is currently under development. The whole model will be able to predict
thermal sensation in transient and asymmetric thermal environments.

References
ASHRAE 1989, Handbook of Fundamentals SI Edition, (American Society of Heating,
Refrigerating and Air Conditioning Engineers, Inc., Atlanta)
de Dear, R.J., Arens, E., Zhang H. and Oguro, M. 1997, Convective and radiative heat
transfer coefficients for individual human body segments. International Journal of
Biometeorology, 40, 141–156
Fanger, P.O. 1970, Thermal Comfort. (Danish Technical Press, Copenhagen)
Heinemeier, K.E., Schiller, G.E. and Benton, C.C. 1990, Task conditioning for the
workplace: issues and challenges. ASHRAE Transactions, 96, 678–688
Humphreys, M.A. and Nicol, J.F. 1996, Conflicting criteria for thermal sensation within the
Fanger predicted mean vote equation. Proceedings of the CIBSE/ASHRAE Joint National
Conference, (Harrogate, UK)
ISO 7730 1994, International Standard 7730, Moderate Thermal Environments:
Determination of the PMV and PPD Indices and Specification of the Conditions for
Thermal Comfort, (ISO, Geneva)
Stolwijk, J.A.J. 1971, A mathematical model of physiological temperature regulation in man,
NASA CR-1855, (NASA, Washington)
Thellier, F., Cordier, A. and Monchoux, F. 1994, The analysis of thermal comfort
requirements through the simulation of an occupied building, Ergonomics, 37,
817–825
Wang, L. 1992, Research on Human Thermal Responses During Thermal Transients, M.Sc.
dissertation, (Tsinghua University, P.R.China)
THE USER-ORIENTED DESIGN, DEVELOPMENT AND EVALUATION OF
THE CLOTHING ENVELOPE OF THERMAL PERFORMANCE

Damian Bethea and Ken Parsons

Human Thermal Environments Laboratory (HTEL)


Department of Human Sciences
Loughborough University
Loughborough, Leicestershire, LE11 3TU
England, UK
Tel: +44 (0) 1509 22 81 65
email: D.Bethea@lboro.ac.uk K.C.Parsons@lboro.ac.uk

The Clothing Envelope of Thermal Performance (CETP) provides a range of
limits describing an ensemble in terms of wearer comfort and/or safety over
a range of environmental parameters and work rates. This study took the
systems approach for the user-oriented design, development and evaluation
of the CETP. 4 Methods of Presentation (MoP) of the CETP and their
associated documentation were iteratively designed using ergonomics
principles and guidelines. The MoP’s were an Area Graph, a Table, a
Psychrometric Chart and a Decision Tree. Following a pilot trial, 2 MoP’s
were selected for a formal user trial with a representative user population.
The results of the Formal User Trial recommended that the Area Graph be
the Method of Presentation for the CETP. The successful application of the
CETP may help to reduce heat stress amongst wearers of PPC in industry.

Introduction
Personal Protective Clothing (PPC) is selected and allocated on the strength of its hazard
protection qualities, and, as such, is described by manufacturers in terms of its “technical
specification” i.e. permeability, material abrasiveness etc. A consequence of this increased
protection from the hazardous environment is that the clothing may not adequately facilitate
the transfer of heat from the body to the environment, potentially causing heat stress. Health
& safety managers/advisors are presented with little or no information about the effects
protective clothing will have on the wearer in different environmental situations. The concept
of a clothing envelope of thermal performance could be applied, which would specify the
clothing ensemble in terms of its thermal performance specification over a range of
environmental parameters. Each envelope is specific to an ensemble and, for a given type of
work, will indicate the range of limits within which human comfort and/or safety will be
achieved. If the ensemble performs within the envelope, the performance is acceptable. This
project was concerned with the user-oriented design, development and evaluation of a
Clothing Envelope of Thermal Performance (CETP). It considered both the generation of
the CETP using current climatic ergonomics knowledge and the use of user-oriented design
and evaluation principles and guidelines to ensure that the CETP is practical and usable. The
successful application of the CETP may help to reduce heat stress amongst wearers of PPC in
industry.

Systems design approach


A 4-stage systems approach was developed for the completion of the objectives, although
only the first 3 stages were completed (see Figure 1). This design process was developed to
accommodate the time schedule and resources available.

Figure 1. Flow Process Chart of the design, development and evaluation of the CETP

Stage 1: Synthesis Stage


The Synthesis Stage involved the initial planning and design of the CETP paper interface.
Design objectives were identified following interviews with experienced Health & Safety
Executive (HSE) inspectors and health & safety managers, occupational hygienists, product
managers etc. from industry. The users were then formally defined as those people
responsible for the health and safety PPC policy within their company and will be referred to
as health & safety managers/advisors.
The data characteristics of the CETP were generated using the 2-node
model of human thermoregulation. The model was adapted to provide the combinations of all
four environmental parameters and work rate that would cause an increase in
core temperature to 38°C for a particular ensemble, thereby creating the limits of the
envelope. The following specifications were used:

Model Specification
The model was rewritten so that the core temperature was set at 38°C, and the intrinsic
clothing insulation at 1.5clo. The skin wettedness was set at 1, which is the maximum skin
wettedness for acclimatised workers, and the relative humidity was varied in 10%
increments from 10% to 100%.

Problem Specification Inputs


The inputs for the model were metabolic rate (from 75–225 W/m2 in increments of 25 W/m2),
air velocity (from 0.2–1 m.s-1 in increments of 0.2 m.s-1) and mean radiant temperature (from
30–90°C increasing by 10°C increments).

Outputs Interpretation
This resulted in 560 combinations of the three inputs, producing, for each 10% increment in
humidity, the predicted air temperature that results in a rise of core temperature to 38°C after 1
hour of exposure.
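
This generation step can be sketched as below; predicted_core_temp() is a stand-in for the
adapted 2-node model (it is not the authors' code), and the search bounds, tolerance, toy model
and output format are our own assumptions.

import itertools

def critical_air_temp(predicted_core_temp, met, vel, trad, rh,
                      lo=0.0, hi=60.0, target=38.0, tol=0.05):
    """Bisection for the air temperature (degC) at which the predicted core
    temperature after 1 h of exposure reaches the target, assuming the model
    response is monotonic in air temperature over [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if predicted_core_temp(ta=mid, met=met, vel=vel, trad=trad, rh=rh) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def generate_envelope(predicted_core_temp):
    """Tabulate the limiting air temperature for every combination of inputs."""
    met_rates = range(75, 226, 25)             # W/m2
    velocities = [0.2, 0.4, 0.6, 0.8, 1.0]     # m/s
    rad_temps = range(30, 91, 10)              # degC
    humidities = range(10, 101, 10)            # %
    return [(met, vel, trad, rh,
             critical_air_temp(predicted_core_temp, met, vel, trad, rh))
            for met, vel, trad, rh in itertools.product(met_rates, velocities,
                                                        rad_temps, humidities)]

# Toy stand-in for the adapted 2-node model, used only to exercise the code.
def toy_model(ta, met, vel, trad, rh):
    return 36.8 + 0.02 * (ta - 20) + 0.004 * met + 0.005 * (trad - 30) + 0.003 * rh - 0.5 * vel

rows = generate_envelope(toy_model)
print(len(rows), rows[0])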

Following the generation of the data model from the 2-node model, 4 methods of presenting
the CETP were developed as a paper interface. They were an Area Graph; a Table; a
Decision Tree and a Psychrometric Chart (see Figures 2 and 3). In parallel with this
development of the MoP’s, the associated documentation (Information & Instructions) was
iteratively developed according to the user requirements, the functional specification
and the specific design of each MoP. Design guidelines such as how to present information in
tables, the use of colour, spatial cues etc. were used in the development of both the MoP
interfaces and their documentation. Methods used included assigning categories to air
velocity (air movement), radiant temperature and work rate such as low, medium, high, and
the use of colour coding for these categories. Definitions of the categories and their ranges
were supplied in the Instructions.

Figure 2. Representation of the Area Graph MoP. Colour coding is not shown and
has been substituted with Letters for the air temperature categories and formatted
lines for the work rates. The arrows show the increase in work rate for each category.
This is the Area Graph for low radiant temperature at low air movement.

Figure 3. Representation of the Table MoP. Colour coding is not shown. (Note this is
the Table for Low radiant temperature at Low air movement)

An important aspect of developing the MoP was allocating functions to each of the
parameters as it would have been impossible to represent all the data on a single interface. A
fundamental requirement of this allocation of function was that the user’s decision process,
when using the CETP, had to be identical irrespective of the MoP they were using.

Stage 2: Basic Design


The 4 MoP’s were further developed and evaluated using Heuristic Analysis and Informal
Walk-Throughs. During this stage, the User Trial Pack (UTP) consisting of an Introduction,
Instructions, Scenarios, Questionnaires etc, was developed. Due to the explicit differences
between the MoP’s, the same UTP’s could not be used for all methods. Therefore the MoP’s,
their instructions and the UTP’s for each MoP were analysed to ensure that there were no
confounding errors in the nature of the information provided.

User Trials
User trials were used to evaluate the 4 MoP’s against usability criteria (e.g. ease of use,
performance), including user characteristics and accuracy in using the MoP
and its Instructions for use. Objective differences between the MoP’s were
evaluated by providing the participants with an Example Scenario of a working environment
that needed to be assessed. The environmental parameters and the nature of the work rate were
provided. The participant was then required to interpret the CETP using the associated
documentation and MoP to determine whether the hypothetical clothing represented by the
CETP was suitable for the environment and work described in the Scenario. They were then
provided with answers to the questions so they could evaluate their own performance. Therefore,
the Example Scenario acted as a training tool. They then answered 2 further Scenario Questions.
Likert scales were used to evaluate the subjective responses to usability criteria, such as
ease of use, accuracy of the CETP and appropriateness of information. The results from the
Scenario Questions provided the only objective data from the User Trial. Subjective
data provided supportive information, but objective performance was the main criterion used
for the assessment of the MoP’s.

Stage 3: Summative Evaluation


A pilot study was carried out using 24 student volunteers to assess the usability of the 4
MoP’s in a between-subject design (6 subjects per MoP). From this study the Area Graph and
Table were selected for a formal evaluation by a representative population.

Experiment 2
40 health and safety managers were sent UTP’s by post (20 per MoP). The results of this User
Trial were used to propose a final CETP interface which could be used by those people responsible
for the PPC policy in companies in industry. The independent variable was the MoP (Area
Graph or Table), with the dependent variables being the usability criteria. The objective data
from all 3 Scenarios, the subjective data and personal data were then collated and analysed.

Experiment 2—Results
The results of the Formal User Trial did not reveal any significant differences between the
usability criteria of the Table and the Area Graph methods of presentation. Those differences
that were observed are discussed below.
The instructions: No significant differences in the subjective ratings of any of the
usability criteria were observed. Therefore differences were not due to the instructions.
The Scenarios: A training effect following the Example Scenario was evident, with the
subjects achieving a higher positive score in the next two Scenarios. However, the Table
produced a high number of false positives for the air temperature range answers in Scenario 1,
resulting in an erroneous assessment of the ensemble’s suitability.
The Method of Presentation. The Area Graph performed better, both in the Scenarios
and in the subjective ratings for the usability criteria.
Other Issues: The air temperature ranges were too large, because all the parameters that
would result in an increase in core temperature to 38°C were included in the CETP. Most
users complained that the CETP did not tell them what PPC should be worn and therefore it
was not directly addressing their issues.

Conclusions
The use of the 2-node model was satisfactory for the requirements of this project; however, more
sophisticated models should be used if the CETP is to be developed further. The Area Graph is the
Method of Presentation that is recommended for the CETP. Although it has been shown that a
user performance specification can be provided and used to allow the selection, procurement and
allocation of appropriate protective clothing, it also needs to be more specific to the environment
in which the PPC is to be worn (e.g. job-task-environment-clothing-specific). At this stage of
development it would appear that the CETP would be best suited to represent PPC in situations
where generic ensembles are worn e.g. the military, the fire brigade, nuclear power industry etc.

Acknowledgements
The authors would like to acknowledge the help of the following people: Len Morris, Dr. Ron
McCaig, Paul Evans and Andrew Phillips from the HSE, Dr. Reg Withey from DERA, Mike
Harris from Loughborough Consultants and Geoff Crockford. Gratitude is also expressed to
those that took part in the study.
A Comparison of the Thermal Comfort of Different
Wheelchair Seating Materials and an Office Chair.

Humphreys, N., Webb, L.H., Parsons, K.C.

Department of Human Sciences,


Loughborough University,
Loughborough,
Leicestershire, LE11 3TU

This study aimed to determine whether there were any thermal comfort
differences between a standard office chair (wool, viscose) and four different
wheelchair seating materials. The materials were tested in a neutral
environment of 23°C, 70% humidity, with a predicted mean vote of 0.
Measurements included thermal sensation, thermal comfort,
stickiness, subjects’ skin temperatures and the temperatures of the chair seat pan
and back. Differences were found between the chairs where the body was in
contact with the seat pan and back. The subjective responses indicated that
on their ‘bottom’ the office chair (wool, viscose) was generally slightly
warmer, less comfortable and more sticky than the wheelchairs, whilst
Wheelchair D (PVC coated) was slightly cooler than all the other chairs. The
objective temperature data support the subjective data.

Introduction
The thermal environment is one important aspect of achieving a “quality indoor environment
for people”. Much work has been done on the thermal comfort requirements of the general
population. However, until now little research has been carried out on the thermal comfort
requirements of people with physical disabilities. One particularly relevant area of interest is
that of the different seating materials used in wheelchairs. Studies of people without physical
disabilities can control the seating which the subject group uses. However, wheelchairs
are specific to the individual, and people cannot always transfer to
a chair chosen by the experimenter. This study therefore examines the
thermal comfort effects of four different wheelchair seating materials compared with that of a
standard office chair. Subjects used in the study were not disabled, as the thermal comfort
requirements (Predicted Mean Vote, BS EN ISO 7730 (1995)) of this population are known,
thereby avoiding the uncontrolled variable of the effects of different disabilities; the
effect of these disabilities on thermal comfort requirements is as yet unknown.

Aim
The aims of this study are twofold. One: To determine whether there are any thermal comfort
differences between a standard office chair (wool, viscose) and wheelchair seating materials.
Two: To determine if there are thermal comfort differences between different seating
materials on wheelchairs.
Hypothesis 1 - Office chairs and wheelchair seating materials do not differ in terms of
thermal comfort.
Hypothesis 2 - Different wheelchair seating materials do not differ in terms of thermal
comfort.

Method
To determine the optimum indoor temperature in which to evaluate the seating materials, a
number of small studies were undertaken. A two-day survey of a day centre for people with
physical disabilities measured the air temperature, radiant temperature and humidity of the
centre during the hours of occupation. In addition, two pilot studies were conducted in a
thermal chamber, one at air temperature (ta) of 23°C and relative humidity (rh) of 70 %,
predicted mean vote=neutral (PMV=0), the other, ta 29°C, 50% rh, predicted mean
vote=slightly warm to warm (PMV=+1.5). The studies suggested that an air temperature of
23°C, relative humidity of 70%, PMV=0, would provide sensitive conditions and reflected
the air temperature of the day centre. These conditions were then used for the main study.

Subjects and Clothing


Five male subjects without physical disabilities participated in the study. Age=22.8±6 years,
height=183±9cm and weight=73±11kg. Each subject wore their own footwear, cotton socks
and underpants. They were provided with trousers and a shirt both of 65% polyester and 35%
cotton, and a sweatshirt of 70% cotton and 30% polyester. The estimated insulation of the ensemble was 1.0 clo.

Experimental Design
Table 1 Material Composition of Seats

Four identical wheelchairs were fitted with four different seating materials. The experimenter
was not informed of the materials used and the chairs were labelled A, B, C, D and office
chair. See Table 1 for the composition of the seating materials. The subjects were exposed to a
different chair on five consecutive mornings; the order of exposure was determined by a 5×5
Latin square. The subjects always sat in the same part of the chamber and the seating was
moved around according to the exposure for the session.
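
The counterbalancing described above can be illustrated with a short routine; the sketch below generates a simple cyclic 5×5 Latin square in Python. It shows the principle only, and is not the specific square used in the study.

```python
# A minimal sketch of a cyclic 5x5 Latin square for counterbalancing chair
# order across five subjects and five mornings. Illustrative only; the actual
# square used in the study is not reported.
chairs = ["A", "B", "C", "D", "Office"]

def latin_square(items):
    n = len(items)
    # Row i is the item list rotated by i places, so every chair appears
    # exactly once per subject (row) and once per morning (column).
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]

for subject, order in enumerate(latin_square(chairs), start=1):
    print(f"Subject {subject}: " + " -> ".join(order))
```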

Measurements
The experimental protocol was similar to that of Webb and Parsons (1997). Subjective and
objective measures were taken. The ISO/ASHRAE 7 point thermal sensation scale was used
as well as subjective scales for thermal comfort, stickiness, preference, satisfaction and
discomfort of specific seat areas. Both overall and body area responses were recorded. In
addition subjects were asked to rate the chair of the present session in relation to the chair of
the previous session at the end of the three hours.
The skin temperatures of the subjects were taken at their mid anterior and posterior thigh,
chest and lower back. Seat temperatures were taken at the chair back and pan.
The environment was measured across three regions of the chamber, at subject chest level.
Measurements taken were: air temperature, globe temperature, humidity, air velocity, plane
and radiant temperature.
Equipment used included: Grant and Eltek series 1000 and 1001b data loggers with EU
skin thermistors, 150mm diameter black globe, Vaisala HMP 35DGT humidity and
temperature probe, air velocity sensor Byral 8455 and a Brüel and Kjær Indoor Climate
Analyser (Type 1213) with humidity, temperature, air velocity and plane radiant probes.

Procedure
The group of five subjects arrived at the laboratory thirty minutes before the experimental
session commenced. Procedures were explained, consent and health check forms completed.
The four skin thermistors for measuring subject skin temperatures were attached to the
subjects and oral temperatures taken. The standard clothing (of correct size) was given to
each subject. Objective measurements were set to record every minute for the three hours.
Subjective forms were completed prior to entering the chamber, on entering the chamber and
every fifteen minutes thereafter. The subjects sat in an upright but relaxed position in their
chairs, watching light entertainment videos for the duration of the session. Any major
deviations from the posture were corrected.

Results
The target environmental conditions were achieved (see Table 2).
Table 2 Actual Experimental Conditions

Thermal Sensation
Body areas not in contact with the chair voted neutral across all five subjects on all five
chairs. Body areas in contact with the chairs, i.e. posterior thigh, bottom and lower back,
showed some variation between the chairs but within one actual mean vote (AMV).
Bottom-area votes for all chairs were close to neutral except for the office chair (wool, viscose),
for which the actual mean vote was generally higher than that of the other chairs, with scores of
neutral to slightly warm (AMV=0–1).

Thermal Comfort
All the chairs were found to be “not uncomfortable”. Scores for the bottom-body area showed
slight variation between the chairs, with the office chair (wool, viscose) showing actual mean
votes of ‘slightly uncomfortable’. However, in general differences between the chairs were marginal.

Stickiness
In the neutral environment to which subjects and chairs were exposed no stickiness was
experienced. Only the office chair (wool, viscose) showed scores nearing ‘slightly sticky’ in
the bottom-body area, towards the end of the three hours.

Preference
Subjects expressed different preferences between their chairs in terms of their desire to be
warmer or cooler. When seated in Chair A (woven nylon), 20 % of subjects wished to be
cooler, whilst when seated in Chair D (PVC coated) 40% would prefer to be warmer. For the
other three chairs subjects preferred no change to the environment.

Satisfaction
In terms of thermal comfort, 40% of subjects were dissatisfied with Chair A (woven nylon),
whilst the other four chairs were all found to be satisfactory. When subjects were asked if
they would be satisfied in a particular chair for the whole day, 40% said they would be
dissatisfied with Chair A (woven nylon) and the office chair (wool, viscose), and 20% would
be dissatisfied with Chair D (PVC coated). A rank order of preference was established: from
most preferred to least preferred, this was Chair C (woven polyester), D (PVC coated),
B (expanded PVC), A (woven nylon) and the office chair (wool, viscose).

Skin Temperatures
There were no statistically significant differences between the skin temperatures of the
subjects across the chairs. However, at the posterior thigh Chair D (PVC coated) was
0.4–0.8°C cooler than the other wheelchairs and 1.7°C cooler than the office chair (wool, viscose),
and at the lower back it was 1.1–1.3°C cooler than the other chairs.

Chair Temperatures
There was a 3.9°C difference between the coolest and warmest chair back and chair pan; the
coolest was Chair D (PVC coated) and the warmest was the office chair (wool, viscose).
Tukey's pairwise comparison showed that Chair D (PVC coated) was significantly cooler
(p=0.05) than the other chairs, and the office chair (wool, viscose) was significantly warmer
(p=0.05) than all the other chairs.
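
For readers unfamiliar with the test, the sketch below shows how a Tukey pairwise comparison of seat temperatures across chairs can be run with the statsmodels library. The temperature values are invented placeholders, not the study's data.

```python
# Illustrative only: Tukey HSD comparison of mean seat-pan temperatures
# between chairs. The values below are invented placeholders; the study's
# raw data are not reproduced here.
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

temps = np.array([30.1, 30.4, 29.9,   # Chair A
                  30.0, 30.2, 29.8,   # Chair B
                  29.7, 30.0, 29.9,   # Chair C
                  27.1, 27.4, 26.9,   # Chair D (PVC coated)
                  31.2, 31.0, 31.4])  # Office chair (wool, viscose)
chairs = ["A"] * 3 + ["B"] * 3 + ["C"] * 3 + ["D"] * 3 + ["Office"] * 3

result = pairwise_tukeyhsd(endog=temps, groups=chairs, alpha=0.05)
print(result.summary())  # flags which pairs differ at the 5% level
```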

Discussion
When evaluating the different seating materials in a neutral environment, no differences were
found with regards to sensation, comfort and stickiness between the four wheelchair
materials. The four materials were varying combinations of polyester, nylon and PVC; three
of the materials had a PVC foam filling of approximately 2mm and a PVC-coated polyester
backing, and one did not. The material that showed slight differences was the office chair
(wool, viscose), which had a wool, viscose and nylon weave, a PVC foam filling of
approximately 5cm and a wooden backing. This chair gave results of slightly warm, slightly
uncomfortable and slightly sticky, mainly in the ‘bottom’ area of the body.
The office chair (wool, viscose) was also the lowest-ranked chair. It was found to be
satisfactory at any one moment; however, 40% said they would not find it satisfactory to sit in
all day. Subjects' posterior thigh skin temperatures were 1.7°C warmer than when sat in Chair
D (PVC coated) and the seat pan temperature was 3.9°C warmer than that of Chair D (PVC coated)

(which was the coolest chair). The office chair (wool, viscose) was significantly warmer than
all other chairs (p=0.05).
Chair C (woven polyester) was ranked first in preference. However, there were no
performance indicators other than ranking to differentiate chair C (woven polyester) from
chair B (expanded PVC), which was ranked 3rd. Chair D (PVC coated) was ranked 2nd
despite 40% of subjects wanting to be warmer when in this chair; 20% would not be satisfied
seated in it all day. Subject skin temperatures at the posterior thigh and back were cooler by
around 1°C than when in other chairs, and the seat pan and back were significantly cooler
(p=0.05) than those of the other chairs. It is not possible to say from the data whether the cooler
effect of this chair was preferred by subjects or what the implications of this would be for warmer
or cooler environments. It was, however, ranked 2nd.
Chair A (woven nylon) was ranked 4th. This chair did not differ in performance with
regard to sensation, comfort and stickiness from the other wheelchairs; however, on the
preference ratings 20% of people wished to be cooler when seated in this chair, and 40% were
dissatisfied both at any one time and at the prospect of sitting in the chair all day.

Conclusions
Overall it was found that the office chair (wool, viscose) differed slightly in terms of thermal
comfort from the wheelchair seating materials. Differences in preference were also found
between the different wheelchair seating materials, but no differences in thermal comfort,
sensation or stickiness were found.
Subjective and objective results supported each other; although some slight variation was
found, the differences were negligible and therefore insignificant in a neutral environment.
Chair D (PVC coated) was found to be the coolest chair, and the subjects would have
preferred to be warmer when seated in this chair. However, in ranking the chairs, Chair D
(PVC coated) was ranked second. It cannot therefore be assumed that the cooler chair was a
negative attribute in a neutral environment; it may, however, have different implications in
warmer or cooler environments.
Except for the office chair (wool, viscose), it is possible that factors other than the
sensation, comfort and stickiness ratings influence subjects' seating material preference.
The warmest chair was the office chair (wool, viscose). The key differing factor between
the coolest, warmest and other chairs appears to be material thickness and foam padding
rather than material type.
Further work needs to be carried out to evaluate the materials in warmer and cooler
environments, and to evaluate more closely the effect of material thickness, type and backing on
the thermal comfort of the seating materials.
A further study is now being undertaken to evaluate the thermal comfort of the same
seating materials in a slightly warm to warm environment (predicted mean vote 1.5, air
temperature 29°C, 50% relative humidity).

References
BS EN ISO 7730 (1994): Moderate Thermal Environments—Determination of the PMV and
PPD Indices and Specification of the Conditions for Thermal Comfort. 2nd ed. (ref.
no ISO 7730:1994(E)) International Standards Organisation, Geneva.
Webb and Parsons (1997) Thermal Comfort Requirements for People with Physical
Disabilities. BEPAC and EPSRC Sustainable Building mini conference, UK 5/6
February. http://www.iesd.dmu.ac.uk/bepac/sustbuild/conf/heat_com.htm#
THE EFFECT OF REPEATED EXPOSURE TO EXTREME
HEAT BY FIRE TRAINING OFFICERS

Joanne O.Crawford and Tracey J.Milne

Industrial Ergonomics Group


School of Manufacturing and Mechanical Engineering
University of Birmingham
Edgbaston
Birmingham B15 2TT

The study examined the effect on fire training officers of repeated exposure
to high temperatures. Physiological measures included heart rate and aural
temperature. Sub-maximal baseline tests were carried out with participants
in PE kit and in fire kit, and oxygen consumption was measured. On entry to
hot conditions, aural temperature and heart rate were monitored. The results
indicated a significant increase in heart rate between the two baseline
conditions. There were no significant differences between baseline
measures of heart rate and temperature (wearing fire kit) and the first entry to
the hot conditions. Heart rate and temperature data obtained from the
second entry to the hot conditions were found to be significantly higher than
in all other conditions. The results indicate a need to further examine working
time and recovery time when working in high temperatures.

Introduction
Firefighting has long been associated with high physiological demands. Many studies, of both
simulated and actual fire suppression, have measured workloads at 60–80% VO2max and up to
95% HRmax (Sothmann et al, 1992; Manning & Griggs 1983). Both physiological and
psychological stresses (including boredom and anxiety) contribute to the overall hazards
affecting the well-being of firefighters (Lim et al, 1987). Personal Protective Equipment
(PPE) is essential, but when fully kitted up an additional 20–30kg in weight is added, which
causes a significant increase in oxygen consumption, heart rate and ventilation rates
(Louhevaara, 1984; Borghols et al, 1978; Sykes, 1993; Love et al, 1994). High temperatures
add an additional physiological load: typical working temperatures are thought to be 38°C to
66°C, but air temperatures as high as 232°C have been recorded in structural fires
(Abeles et al, 1973).
Although actual firefighting tasks have been examined, the nature of firefighting is such
that on a day-to-day basis, firefighters are not continually working in extreme environments.

The role of fire training officers is, however, different in that they are training recruits in hot
conditions and are entering fire training houses more than once per day wearing full fire kit
and self-contained breathing apparatus. Although previous research has recommended
maximum limits for core temperature for working in high temperatures, body temperature is
not necessarily monitored on a day-to-day basis.
The aim of this study was to examine the physiological strain on fire training officers
when they were exposed to hot conditions more than once per day by monitoring heart rate
and aural temperature and comparing data collected with baseline data.

Method
Six participants took part in the study; all were professional male fire brigade training
officers. The equipment consisted of a bicycle ergometer (Monark Ergomedic 818),
Aerosport TEEM 100 metabolic analyser, two heart rate monitors (Polar Sports Tester and
the Polar Vantage NV) and two aural thermometers (Grant Instruments, Cambridge). The
aims of the study were explained to all participants.
Baseline data were collected from the participants in two sub-maximal tests: one wearing PE
kit, and one wearing full fire kit and carrying SCBA. Each test consisted of 4 workloads
on the bicycle ergometer, each lasting 5 minutes with a 5-minute break between workloads.
The starting workload was 20W, followed by 3 further workloads in increments of 20W, the
final workload being 80W. Due to time constraints all baseline data
had to be collected on the same day but participants were allowed recovery time between tests.
Data collected during the baseline tests included oxygen uptake, heart rate and aural temperature.
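
The paper does not state the exact extrapolation used to predict VO2max from the sub-maximal data; a common approach, sketched below in Python with placeholder values, fits a line through the sub-maximal heart rate and oxygen uptake points and extrapolates to an age-predicted maximum heart rate.

```python
# A common way to predict VO2max from sub-maximal cycle data: fit a straight
# line through the steady-state (heart rate, VO2) points at each workload and
# extrapolate to an age-predicted maximum heart rate. The method actually used
# in the study is not stated; all values below are placeholders.
import numpy as np

heart_rate = np.array([98.0, 112.0, 126.0, 141.0])   # bpm at 20, 40, 60, 80 W
vo2 = np.array([0.9, 1.3, 1.7, 2.1])                 # l/min at the same workloads
age = 35

slope, intercept = np.polyfit(heart_rate, vo2, 1)    # VO2 = slope*HR + intercept
hr_max_predicted = 220 - age                          # common age-based estimate
vo2_max_predicted = slope * hr_max_predicted + intercept
print(f"Predicted VO2max ~ {vo2_max_predicted:.2f} l/min")
```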
Physiological monitoring used when the training officers were in the fire house included
heart rate monitors and aural thermometers. Aural thermometers were placed on the training
officers twenty minutes before recording commenced. Temperature was recorded prior to
entry to the fire house and on exit. Heart rate was recorded continuously during the working
period. Air temperature was continuously monitored in the fire house. The time spent in the
fire house varied between 10 and 18 minutes for each participant.

Results
The participants were all male fire training officers, aged between 29 and 42 years.
Average predicted VO2max was found to be 2.88 l/min (range 2.51 to 3.33 l/min).
Heart rate and aural temperature data obtained during the final work phase of baseline
tests are shown in Table 1. The heart rate data and aural temperature data for the hot
conditions are shown in Table 2. The data are presented graphically in Figures 1 and 2 for
each of the four test conditions.
Initial examination of the heart rate data found that the sample were working at 78.8±12%
of maximum heart rate during the second entry to the fire house (range 67% to 99%). The
results are shown in Table 3. Statistical analysis found that when comparing baseline tests,
there was a significant increase in heart rate when wearing fire kit (p<0.02). Using ANOVA
to analyse the heart rate data, it was found that heart rates were significantly higher on the
second entry to the fire house than in all other conditions (p<0.001). However, there were no
significant differences between wearing fire kit in the baseline condition and the first entry to
the fire house.
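
The percentage-of-maximum figures reported above follow from a short calculation; the sketch below uses the common 220 minus age estimate of maximum heart rate (the paper does not state which estimate was used) and placeholder working heart rates.

```python
# Working intensity expressed as a percentage of predicted maximum heart rate.
# The 220 - age estimate and the example values are assumptions for
# illustration; individual data from the study are not reproduced here.
def percent_hr_max(working_hr, age):
    hr_max = 220 - age            # common age-based prediction of HRmax
    return 100.0 * working_hr / hr_max

for age, hr in [(29, 158), (35, 146), (42, 171)]:   # placeholder officers
    print(f"age {age}: {percent_hr_max(hr, age):.0f}% of predicted HRmax")
```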

Table 1. Baseline Data

Table 2. Data from Hot Conditions

Table 3. Average Heart Rate and Percentage of Predicted Maximum Heart Rate

Statistical analysis found that in the baseline tests there were no significant differences
in aural temperature between wearing PE kit and fire kit. Using ANOVA to analyse the aural
temperatures, significant differences were found again with the highest temperatures after the
second entry to the fire house (p<0.01). It was interesting to note that there were again no
significant differences between the baseline condition wearing fire kit and the first entry to
the fire house.
Environmental temperature monitored during the study ranged from 44°C to 240°C.

Figure 1. Heart Rate Data

Figure 2. Aural Temperature Data

Discussion
The participants who took part in the study were of a variety of ages and their predicted
VO2max was found to reach the minimum level recommended by Sothmann et al, (1992) for
firefighting. However, the use of sub-maximal testing to predict maximum oxygen
consumption can introduce errors of around ±15%.
The increase in heart rate across conditions supports the majority of previous literature.
What was interesting from the study was the significant increase in both heart rate and aural
temperature between the two hot conditions indicating that physiological strain was
significantly higher on the second entry to the fire house. This appears to indicate a
cumulative effect between the two conditions. The lack of difference in heart rate and aural
temperature between the baseline condition wearing fire kit and first entry to the fire house
would also support a cumulative effect between the two hot conditions. This raises further
questions about repeated exposure in this environment. One participant’s aural temperature
reached 39.3°C on second entry to the fire house. Although aural temperature is not a direct
measure of core body temperature, it does indicate an increase in body temperature. This

would suggest that there is a need to further examine the time spent in hot conditions and the
amount of time allowed for recovery between entries.
The limitations of this study were the small sample of six participants and the lack of control
over the fire house conditions and the time spent within them. It is very difficult, when using
live fire exercises, to control temperature as it is affected by outside weather conditions.
However, to obtain realistic live data for fire training officers, this was one of the main
methods that could be used and does present future challenges in this field.

Future Research
From this study a number of recommendations can be put forward for future research. The
number of participants should be increased. There is also a need for more controlled
conditions in terms of ambient temperature and standardised working time within the hot
conditions. It would also be recommended that skin temperature be monitored as this may be
a better measure to use in this type of environment. The cumulative effect of heat and wearing
PPE should also be examined further in the future.

References
Abeles, F.J., Delvecchio, R.J. and Himel, V.H., 1973, A fire-fighters integrated life
protection system, phase 1, design and performance requirements. Grumman
Aerospace Corp, New York
Borghols, E.A.M., Dresen, M.H.W. and Hollander, A.P., 1978, Influence of heavy weight
carrying on the cardiorespiratory system during exercise. European Journal of
Applied Physiology , 38, 161–169
Lim, C.S., Ong, C.N., and Phoon, W.O., 1987, Work stress of firemen as measured by heart
rate and catecholamines. Journal of Human Ergology, 16, 209–218.
Louhevaara, V.A., 1984, Physiological effects associated with the use of respiratory
protective devices: a review. Scandinavian Journal of Work, Environment and Health,
10, 275–281.
Love, R.G., Johnstone, J.G.B., Crawford, J., Tesh, K.M., Graveling, R.A., Ritchie, P.J.,
Hutchison, P.A. and Wetherill, G.Z., 1994, Study of the physiological effects of
wearing breathing apparatus. Institute of Occupational Medicine Technical
Memorandum TM/94/05
Manning, J.E. and Griggs, T.R., 1983, Heart rates in fire-fighters using light and heavy
breathing equipment: similar near maximal exertion in response to multiple work
conditions. Journal of Occupational Medicine, 25, 215–218
Sothmann, M., Saupe, K., Jasenhof, D. and Blaney J., 1992, Heart rate responses of fire-
fighters to actual emergencies: implications for cardiorespiratory fitness. Journal of
Occupational Medicine, 34, 27–33
Sykes, K., 1993, Comparison of conventional and light BA cylinders. Fire International,
143, 23–24
The effects of Self-Contained Breathing Apparatus on gas
exchange and heart rate during fire-fighter simulations
Kerry Donovan & Alison McConnell

Sports Medicine and Human Performance Unit,


Applied Physiology Research Group,
School of Sports & Exercise Sciences, University of Birmingham,
Edgbaston, Birmingham B15 2TT.

The aims of the present study were to quantify the effects of wearing self-
contained breathing apparatus (SCBA) during an occupation-specific
laboratory-based test (‘Firetest’) and to validate the test. 8 fire-fighters and
10 civilians performed the Firetest with and without fire-fighter SCBA. The
group mean results with and without SCBA wear were as follows: V̇O2
increased from 1.81 to 2.34 l·min-1 (p<0.01); V̇CO2 from 1.65 to 2.29 l·min-1
(p<0.001); V̇E from 53.49 to 72.74 l·min-1 (p<0.001), and Fc from 123 to 152 bpm
(p<0.001). TTOT fell from 2452 to 1882 msec (p<0.01). Fire-fighters using
this protocol reported that the Firetest was more representative of fire-
fighting activities than other tests in current use. The Firetest will be used to
test the respiratory responses of fire-fighters to SCBA wear and to monitor
the changes in performance following various training interventions.

Introduction
All active fire-fighters are trained to use Self-contained Breathing Apparatus (SCBA). Accurate
determinations of the cost to respiratory function of SCBA wear are therefore desirable. However,
testing of fire-fighters during real emergencies is generally impractical. Most of the published
reports examining fire-fighters’ responses to SCBA wear have utilised standard laboratory
protocols, i.e. treadmill walks and cycle ergometry, (Faff & Tutak, 1989; Wilson et al, 1989).
These are not entirely representative of fire-fighting actions which typically involve whole-
body and upper-body activity, (Sothmann et al, 1991). To address this discrepancy, we have
developed a test which simulates fire-fighting activities in a laboratory environment (Firetest).
Published research indicates that the respiratory accessory muscles are activated at an
earlier stage in exercise when wearing SCBA, (Louhevaara et al, 1995). Furthermore,
evidence suggests that SCBA compresses the thorax, “preventing free and efficient
movement” (Louhevaara et al 1985, pp 215). This restriction, the added mass of the SCBA
and the upper-body work typical of fire-fighting activities may increase respiratory demands
resulting in significant changes in breathing pattern and ventilation.

The aim of the present study was twofold: 1) to validate the Firetest in the light of previous
work carried out in the laboratory environment and in the field, and 2) to quantify the
ventilatory and metabolic effects of wearing full fire-kit including SCBA (dry mass ~23kg),
during the Firetest.

Methods
18 male volunteers successfully completed the study; 8 were professional fire-fighters and 10
were civilians (see Table 1). All were physically active and none were smokers. Ethics
committee approval and informed written consent were obtained prior to the study.

Table 1: Group physical characteristics.

Physiological measurements
Respiratory air flow was measured breath-by-breath using an ultrasonic phase-shift flow
meter (Flowmetrics, Birmingham, UK), to monitor: inspired and expired tidal volumes, total
breath duration (TTOT), peak inspiratory and expiratory flow rates, mean inspiratory flow rate,
respiratory frequency (FR), and minute ventilation (V̇E). A mass spectrometer analysed
inspired and expired air, (Airspec MGA 2000, UK). Sampling was performed using a fine-
bore polythene catheter (~2.5m long), inserted into the flow meter manifold which was
incorporated into the inlet port of a modified SCBA face-mask.
Heart rate (Fc) was monitored using a Sportstester, (Polar Oy, Finland). Any Fc above 95%
Fcmax would have resulted in termination of the test; no test was terminated under this criterion.
Breathlessness was monitored at one minute intervals using a twelve point modified Borg scale.

Procedure
Visit 1: Anthropometric measures were made, followed by a treadmill test to volitional
fatigue to measure V̇O2 max. Lung function was measured before and after the V̇O2 max test.
Failure to meet any of the guidelines laid down by the Home Office (WM Fire, personal
communication) led to the removal of volunteers from the study. The fitness data were used
to calculate the Firescore*, obtained using a modified version of a test designed by Davis &
Dotson (1982). Scores <1065 do not meet requirements, >1065 are “Fair”, >1150 are “Good”
and >1250 “Excellent”. The range of scores for the group was 1079–1336.
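
The Firescore bands quoted above translate directly into a small classification routine; the sketch below simply encodes the stated thresholds. How the score itself is computed from the fitness data is not reproduced here.

```python
# Classify a Firescore into the bands quoted in the text. Only the thresholds
# are taken from the paper; the calculation of the score itself (from the
# modified Davis & Dotson test) is not reproduced here.
def firescore_band(score):
    if score < 1065:
        return "Does not meet requirements"
    if score <= 1150:
        return "Fair"
    if score <= 1250:
        return "Good"
    return "Excellent"

for s in (1050, 1079, 1200, 1336):   # 1079 and 1336 are the group extremes
    print(s, firescore_band(s))
```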
Visit 2: Tests of lung function were followed by a 5 minute rest. The volunteers then performed
the Firetest. Heart-rate (Fc) was monitored throughout. Volunteers performed the Firetest in PE
kit and training shoes. Visit 3: Repeated Visit 2 but this time wearing full fire-fighter turn-out kit
including SCBA. Following a debrief, the fire-fighters indicated that the Firetest produced
physiological responses similar to those encountered whilst wearing SCBA.
The Firetest is a nine stage, sub-maximal, progressive test, (see Table 2). Computer
storage limitations required the Firetest to be monitored in two halves (stage 5 was used to
open a new data file and thus could not be monitored). The treadmill speed and gradient were
kept constant (5 kph, 6%) during each relevant stage.

Table 2. The nine-stage task-specific Firetest

Statistical analysis
Data were means of the last ten breaths for the final minute of each monitored stage, (1 data
point per stage per variable). Student’s t-tests for paired observations were performed to test
for differences. Pearson’s product moment coefficient tested for associations between
variables. A probability of 5% was accepted as significant.
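
Both tests named here are available in scipy; the sketch below shows the two calls (a paired t-test and a Pearson correlation) on placeholder arrays, as an illustration of the analysis rather than a re-analysis of the study data.

```python
# Illustrative use of the two tests named in the text: a paired t-test between
# the with-SCBA and without-SCBA conditions, and a Pearson correlation between
# Firescore and a strain measure. Arrays are placeholders, not study data.
import numpy as np
from scipy import stats

vo2_no_scba = np.array([1.7, 1.9, 1.8, 1.6, 2.0, 1.9])   # l/min, placeholder
vo2_scba    = np.array([2.2, 2.4, 2.3, 2.1, 2.6, 2.4])   # l/min, placeholder
t_stat, p_paired = stats.ttest_rel(vo2_scba, vo2_no_scba)
print(f"paired t = {t_stat:.2f}, p = {p_paired:.3f}")

firescore      = np.array([1079, 1120, 1180, 1240, 1290, 1336])
cardiac_strain = np.array([  91,   88,   84,   80,   76,   72])  # % HRmax
r, p_corr = stats.pearsonr(firescore, cardiac_strain)
print(f"Pearson r = {r:.2f}, p = {p_corr:.3f}")
```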

Results

Table 3. Results of Visits 2 & 3 Means (±SDev), n=18

The volume of air used during SCBA wear is of great relevance to fire-fighters and was
calculated by summing the expired tidal volume per breath (litres, BTPS). Results of a t-test
for paired data showed that air use between visits increased significantly, from a mean of
1152 l (±135) to 1601 l (±247; n=18, p<0.01), a mean SCBA cost of 39% (range 32 to 51%).
These data show that most of the volunteers would have remained within safe working limits
(~1850 l), and none would have exhausted the air supply (~2250 l).
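
The air-use figure is simply the sum of expired tidal volumes over the test; a minimal sketch of that bookkeeping, with placeholder breath records, is given below.

```python
# Total air use as the sum of expired tidal volumes (litres, BTPS) over a test,
# and the relative 'SCBA cost' between the two visits. The breath lists are
# placeholders standing in for the breath-by-breath records.
def total_air_use(expired_tidal_volumes_litres):
    return sum(expired_tidal_volumes_litres)

without_scba = [2.0] * 580   # ~580 breaths of ~2.0 l
with_scba    = [2.1] * 760   # ~760 breaths of ~2.1 l

v_without = total_air_use(without_scba)
v_with = total_air_use(with_scba)
cost_percent = 100.0 * (v_with - v_without) / v_without
print(f"{v_without:.0f} l vs {v_with:.0f} l, SCBA cost ~{cost_percent:.0f}%")
```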
A Pearson's product moment correlation coefficient indicated a significant negative
association between a volunteer's Firescore and the cardiac strain (82% Fcmax) imposed by
SCBA (r=-0.59, p<0.025). There was also a significant negative correlation between the
Firescore and the mean aerobic strain (54% V̇O2 max) during Visit 3 (r=-0.67, p<0.005).

Three volunteers who successfully completed Visit 1 failed to complete the Firetest during
Visit 3. They cited exhaustion, light-headedness and/or upper-body fatigue as reasons for
early termination. The mean Borg score at test termination for these three volunteers
indicated 'severe-to-very-severe' breathlessness.

Discussion
An early study by Louhevaara et al, (1985) showed that wearing SCBA increased fire-
fighters' V̇O2 during progressive treadmill walks by ~0.54 l·min-1. The present study indicates
a mean SCBA-induced increase in V̇O2 of 0.53 l·min-1. These data also suggest that changes in
respiratory demand generally exceeded 25% (see Table 3), greater than has been reported
elsewhere (Louhevaara et al, 1986). These results support the rationale for the Firetest,
namely that whole-body and upper-body work increase the loading on the respiratory system
and elicit greater respiratory demands than those elicited by the mass of the SCBA alone.
Although VT increased significantly between tests, the difference was modest compared
with the differences seen in the other variables (5% rather than >25%). The small increases in
VT following Visit 3 cannot be explained by the normal exercise dynamics, as the 'levelling
off' point during the maximal test was 3.25 l·min-1 (58% of mean FVC). The mean VT seen
during Visit 3 was only 2.17 l·min-1 (38% FVC). The majority of the increase in V̇E can
therefore be attributed to decreased TTOT and the concomitant increase in FR (–36% and
+29%, respectively). This hypothesis is consistent with the non-significant changes in the
ventilatory equivalent for O2 (V̇E/V̇O2) between the two conditions. There was a slight but
significant increase in duty cycle (TI/TTOT), suggestive of an increased inspiratory demand.
These results show that SCBA may limit VT, causing the increased metabolic demands to be
met primarily by increases in FR and changes in duty cycle. This breathing pattern is
inefficient, decreases effective alveolar ventilation and may result in premature respiratory
fatigue.
A study which examined the effects of industrial respirators (incl. SCBA) suggested that
individuals with V̇O2 max >50 ml.kg.min-1 have the greatest chance to "override the effect of
respirator work on performance," Wilson et al, (1989, p92). However, the three subjects who
failed to complete Visit 3 had a mean V̇O2 max of ~49.6 ml.kg.min-1. The Wilson et al,
(1989) study used standard laboratory protocols (maximal treadmill and cycle ergometer
tests), which did not necessarily generate fire-fighter-specific respiratory demand. It follows
that V̇O2 max (ml.kg.min-1) may not be the most important determinant of SCBA tolerance
during standard fire-fighter tasks.
Although there were significant negative associations between the Firescore and both
aerobic strain (% V̇O2 max) and cardiac strain (% Fcmax), only 44% and 34%, respectively, of
the variation could be accounted for by the variation in Firescore. This may be the result of
the relatively well-trained status of the group as a whole. It is expected that a more
heterogeneous group would demonstrate wider variations in Firescores and a greater negative
association between Firetest score and cardiac and aerobic strain. This was partially
demonstrated by the three volunteers who failed to complete Visit 3. Their mean cardiac
strain and aerobic strain (96% Fcmax and 59% V̇O2 max, respectively) were higher at
termination than the means for the group during Visit 3. Further studies utilising the Firetest
will attempt to test this hypothesis by assessing the responses of less well-trained volunteers.

Research into load carriage and walking indicates that the addition of a heavy backpack
(or SCBA) forces walkers to lean forwards (Gordon et al, 1983). Such alterations to normal
gait are resisted by eccentric and isometric contraction of various muscle groups including
those of the lower back and the abdominals. As the abdominal muscles are active during
forced expiration, this may have knock-on effects upon respiration leading to respiratory
fatigue. Likewise, isometric contraction of the shoulders, upper chest and upper-limbs may
also impact negatively on respiratory muscles. Furthermore, the mass of the SCBA and the
aforementioned inspiratory restriction may exacerbate respiratory strain. The effects that
such potential respiratory fatigue has upon respiratory function will be the object of future
investigation.
The fire-fighters in the present study reported that the Firetest was more representative of
SCBA activities than other tests in current use and elicited responses similar to those
generated during fire-fighting activities that require SCBA. This protocol will be used to test
the respiratory responses of fire-fighters to SCBA wear and to monitor the changes in
performance following various training interventions.

Summary
The results of the present study show that most of the volunteers were able to cope well with the
added strain of the SCBA, indicating that the physiological characteristics of this group seem to
be reasonable for fire-fighters, supporting the recent findings of Louhevaara et al, (1995).
We have quantified the effects of wearing SCBA during the Firetest and we are confident
that the Firetest is a useful tool for investigating the effects of SCBA on fire-fighter performance,
ventilation and breathing pattern in the laboratory environment. This was supported by the fact
that three volunteers who successfully completed Visit 1 and thus had met the Home Office
minimum requirements for UK fire-fighters, failed to complete the Firetest in Visit 3.

References
Davis P., Dotson C., & Santa Maria D., (1982)., “Relationship between simulated fire-
fighting tasks and physical performance measures,” Medical Science of Sport &
Exercise, 14, 67–71.
Faff J., & Tutak T., (1989), “Physiological responses to working with fire-fighting equipment
in the heat in relation to subjective fatigue,” Ergonomics, 32, 629–638.
Gordon M., Goslin Br., Graham T., et al., (1983), “Comparison between load carriage and
grade walking on a treadmill,” Ergonomics, 26, 289–298.
Louhevaara V., Ilmarenen R., Griefahn B., Kunemund C., & Makinen H., (1995), “Maximal
physical work performance with European standard based fire-protective clothing
system and equipment in relation to individual characteristics,” Eur J Appl Physiol, 71,
223–229.
Louhevaara V., Smolander J., Tuomi T., Korhonen O., & Jaakkola J., (1985)., “Effects of an
SCBA on Breathing Pattern, Gas Exchange and Heart Rate during Exercise,” J. Occ
Med., 27, 213–216.
Sothmann M., Saupe K., Raven P., et al., (1991)., “Oxygen consumption during fire
suppression: error of heart rate estimation,” Ergonomics, 34, 1469–1474.
Wilson J., Raven P., Morgan W., Zinkgraf S., Garmon R., & Jackson A., (1989)., “Effects of
pressure-demand respirator wear on physiological and perceptual variables during
progressive exercise,” Am. Ind. Hyg. Assn. J., 50, 85–94.
THE EFFECT OF EXTERNAL AIR SPEED
ON THE CLOTHING VENTILATION INDEX

Lisa Bouskill1, Ruth Livingston1, Ken Parsons1 and W R Withey 2

1 Department of Human Sciences


Loughborough University LE11 3TU

2 Centre for Human Sciences

Defence Evaluation and Research Agency


Farnborough GU14 0LX

To quantify the importance of external air movement on the heat exchange
through clothing, 9 male subjects, wearing a wind-resistant GoreTex™ suit,
were exposed twice to an environment of ta=tr=5.0 (1SD=0.3)°C and rh=62 (1SD=1)%.
One exposure had still air, 0.12 (0.02) ms-1; during the other, air speed was 3.06 (0.04) ms-1.
The Ventilation
Index (VI) of the suit was measured while subjects performed 3 physical
activities: standing stationary, stepping onto and off a raised platform, and
rotating limbs. High air speed increased VI by 37%, 53% and 52% respectively
for the 3 activities (P<0.01). These data imply that wind can induce large sensible
and evaporative heat losses, even in wind-resistant clothing.

Introduction
An increase in the movement of air through a clothing ensemble (often known as the
‘pumping’ or ‘bellows’ effect) reduces both its thermal insulation and its apparent
evaporative resistance, generally resulting in a higher heat loss from the skin than expected.
This may be advantageous when working in circumstances which require heat loss to
maintain a safe deep-body temperature (eg in a hot work-place, or when wearing
encapsulating clothing); however, it may be disadvantageous when circumstances require
heat conservation. Thus, it has been suggested that: “…the measurement and control of [the
pumping effect] seem to be the key to further advances in the field of thermal insulation of
clothing ensembles…” (Vogt et al, 1983).

One of the factors that, in principle, can affect clothing ventilation is the speed of external air
movement (wind). Several mechanisms may operate. Air may enter garments through the
fabrics, depending upon their air permeability, on the condition of the seams and on garment
age and general condition. Air may also move through openings (sleeves, cuffs, gaps around
the collar) and pre-designed vents. The presence of restrictions such as a belt or harness, and
a tighter fit of the clothing may reduce the magnitude of these effects.

It is therefore clear that for a complete evaluation of the worker and his thermal
environment it is essential to quantify the relationship between wind speed and the
consequent changes in heat exchange between the skin and the external environment. The aim
of this preliminary study was to examine the effect of wind on the amount of air moving
through a clothing ensemble, as measured by the clothing Ventilation Index (VI) (Birnbaum
and Crockford, 1978). The null hypothesis (H0) was that, in the particular clothing ensemble
chosen for the study, increased air speed would have no effect on the VI. The alternative
hypothesis (H1) was that increased air speed would increase VI.

Materials and Methods

Subjects
Nine healthy, physically-active males, age range 19 to 30 years, volunteered to participate in the
study. They were fully informed of the objectives, procedures and possible hazards of the study
and completed a Form of Consent before exposure. Left aural temperature and heart rate were
recorded as safety measures. Physical characteristics of the subjects were: Height=1.81 (1
SD=0.06)m; Weight=78.46 (9.13)kg; Dubois Surface Area=1.99 (0.13)m2; Age=22.56 (3.17) years.

Test Protocol
Each subject was exposed twice (once in each air speed), in a controlled-environment
chamber, to the following thermal conditions: ta=tr=5.0 (0.3) °C, and rh= 62 (1) %. In the
‘still air’ exposure, air speed (Va) at chest height was 0.12 (0.02)ms-1; in the second exposure
air speed was increased to 3.06 (0.73)ms-1 using a Micromark 16-inch diameter, pedestal fan,
positioned 0.75m in front of the subject. The order in which subjects were exposed to the low
and high speeds was randomised.

Physical Activities
During each exposure subjects performed 3 activities:

• Standing stationary: with feet together, arms by sides;


• Stepping onto and off a platform 150mm high, each step movement took 1.2s, as cued by a
metronome;
• Rotating each limb individually in large arcs, each arc taking 4.8s to complete as cued by
a metronome.

VI was determined 3 times in succession during each activity (the method is described
below), from which an average value was calculated. The order of presentation of the 3
activities was balanced between subjects.

Clothing ensemble
Subjects wore: underpants (own), short socks, soft ‘trainers’ and a 2-piece suit comprising
trousers with an elasticated waist and a zip-fronted, long-sleeved jacket with elasticated
bottom hem. The suit was a commercially-available item of leisurewear made from
GoreTex™ material, with cotton elasticated wrist cuffs, and popper fastenings at the ankles.
The jacket zip was protected with a popper-fastened wind baffle. During the study the wind-
baffle on the jacket and the ankle poppers were fastened. Each subject wore a size of suit
appropriate to his stature and physique.

This type of suit was chosen because GoreTex™ fabric has a very low air permeability.
The effect of wind on the clothing ventilation would therefore be caused mainly by pumping,
and little influenced by the permeation of air through the fabric. Thus, the combination of
relatively air-impermeable garments and closed neck, cuffs and ankles, represents a clothing
condition in which low values of VI would be expected and therefore any effect of external air
movement on VI should be maximised.

Determination of the Ventilation Index.


VI is defined as the product of the rate at which the air within the clothing is exchanged, and
the volume of the micro-environment available for this exchange, ie

VI (litres per minute)=Air exchange (rate per minute)×Micro-environment volume (litres)

Air exchange rate and micro-environment volume were measured as follows (Bouskill et
al, 1997):

Air exchange rate: The rate of air exchange between the micro-environment (ie the space
between the skin and the suit) and the external environment was measured using a tracer gas
technique. Nitrogen was flushed through the micro-environment using a system of
distribution tubes worn next to the skin. The gas in the micro-environment was sampled using
a separate system of tubes, also worn next to the skin, connected to a vacuum pump and an
oxygen analyser. The time taken for the oxygen concentration of the micro-environment gas
to return to 19% from 10% was used to calculate the air exchange rate. This determination
was repeated 3 times for each physical activity, in both air speeds.

Micro-environment volume: The volume of air trapped within the ensemble was determined
in triplicate in a separate session. For this measurement the subject donned over the clothing
ensemble a 1-piece, air-impermeable oversuit sealed at the neck, which enclosed the whole
body including the hands and feet. Air was evacuated from this oversuit until the micro-
environment pressure (measured on a water-filled manometer attached to a perforated tube
placed in one trouser leg) began to change. This was taken to be the point at which the air-
impermeable oversuit lay just on top of the clothing ensemble. Evacuation continued until no
more air could be removed from the clothing. This additional volume of evacuated air was
taken to be the micro-environment volume.
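
One way to turn the reported 10% to 19% oxygen recovery time into an exchange rate is to assume perfect mixing of the micro-environment, in which case the oxygen concentration relaxes exponentially towards the ambient level. The sketch below applies that assumption and then forms the Ventilation Index as exchange rate multiplied by micro-environment volume; it is an illustrative reading of the method rather than the authors' exact calculation, and the input values are placeholders.

```python
# Under a perfect-mixing assumption the micro-environment O2 concentration
# recovers exponentially towards ambient: C(t) = C_amb - (C_amb - C0)*exp(-n*t),
# where n is the air exchange rate (changes per minute). Solving for n from the
# time taken to rise from 10% to 19% O2, then VI = n * micro-environment volume.
import math

def exchange_rate_per_min(t_minutes, c0=10.0, c_end=19.0, c_ambient=20.9):
    return math.log((c_ambient - c0) / (c_ambient - c_end)) / t_minutes

def ventilation_index(t_minutes, micro_env_volume_litres):
    return exchange_rate_per_min(t_minutes) * micro_env_volume_litres

# Placeholder values: a 2-minute recovery and a 12-litre micro-environment.
print(f"VI ~ {ventilation_index(2.0, 12.0):.1f} l/min")
```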

Results
The mean (and 1 SD) values of VI measured for all 9 subjects in each of the 3 physical
activities, and the 2 air-speed conditions, are given in Table 1.

Table 1: Mean Ventilation Index in 3 physical activities and 2 air-speed conditions (for
details see text)

In both still air and high air-speed conditions, physical activity increased the VI. In still air
the increases were 52% and 79% for the stepping and rotating limbs activities, respectively;
in the high air speed the corresponding increases were 70% and 98%.

For all activities, the mean VI was greater in the high air speed condition than in the still air
condition, the differences being 37%, 53% and 52% for the standing, stepping and rotating
limbs activities, respectively. Preliminary statistical analysis using Student’s t-tests shows
that all these differences were statistically significant (P<0.01).

To show the consistency of the effects observed, the values for individual subjects are shown
in Figure 1. As expected, each subject had a unique value for VI in each physical activity.
Previous experience (Bouskill et al, 1997 and Bouskill et al, 1998) has shown that this value
is a function of clothing fit, and the type of movement involved. Therefore comparisons of the
VI values between subjects add no useful information.

Figure 1: Ventilation Index in the 9 individual subjects


Values are given for 3 physical activities and 2 air-speed conditions
1, 3 and 5 are the ‘still’ air condition; 2, 4 and 6 the ‘high’ air speed condition

Discussion
This study examined the effect of increasing external air velocity on the clothing ventilation
index, and is part of a wider study of factors which affect sensible and evaporative heat loss
from the clothed worker. This information can be used for a variety of practical purposes, for
example, to set safe exposure times for hot or cold conditions, or to calculate effective work/
rest schedules to maintain worker productivity and efficiency.

Some calculations, for example, BS EN ISO 12515, assume that the values of the sensible
and evaporative resistances required in the calculations can be the ‘intrinsic’ values measured
in laboratory conditions often using thermal manikins. However, several studies have shown
that the values obtained in working conditions—the ‘resultant’ values—can be lower (see

Havenith et al, 1990). Part of this difference can be explained by the increased convective
heat loss caused by air movement through the clothing as a result of wearer movement and
‘wind’ speed. Although there have been studies to determine an empirical relationship
between intrinsic and resultant resistances (Havenith et al, 1990), we know of no studies
which have attempted to quantify systematically the factors that comprise the overall effect,
eg air flow through the clothing layers. The VI is a measure of air flowing through the micro-
environment; it therefore has the potential to quantify the convective heat loss this will
induce.

The VI values measured in the different physical activities in this study are typical of those
obtained in a previous study (Bouskill et al, 1997), and are reproducible—the difference
between the triplicate measures in the present study being less than 10%. The consistent
effect of the increased air speed shows that the amount of air moving through the micro-
environment of this clothing ensemble was sensitive to the air speed. GoreTex™ fabric has a
very low air permeability, so most of the wind-induced increase must have taken place
through the openings in the suit, even though these were in the ‘closed’ position. The
mechanism for this may be direct displacement of the micro-environment air, or convective
exchange arising from the wind creating a duct effect at garment openings.

From our data it can be concluded that wind has a significant effect on clothing ventilation,
and therefore on sensible and evaporative heat transfer between the clothed worker and the
external environment, even in wind-resistant clothing. This affects thermal strain, and should
be considered in assessments of the worker’s thermal environment.

References
Birnbaum, R.R. and Crockford, G.W. 1978, Measurement of clothing ventilation index.
Applied Ergonomics, 9, 194–200.
Bouskill, L.M., Withey, W.R., Watson-Hopkinson, I. and Parsons, K.C. 1997, The
relationship between the clothing ventilation index and its physiological effects. In:
R.Nielsen and C.Borg (eds) Proceedings of Fifth Scandinavian Symposium on
Protective Clothing. Elsinore, Denmark. 5–8 May 1997, 36–40.
Bouskill, L.M., Sheldon, N., Parsons, K.C. and Withey, W.R. 1998, The effect of clothing fit
on the ventilation index. In M.A.Hanson (ed) Contemporary Ergonomics 1998.
(Taylor and Francis, London).
British Standards Institution, 1997. Hot environments—Analytical determination and
interpretation of thermal stress using calculation of required sweat rate. BS EN ISO
12515:1997.
Havenith, G., Heus, R. and Lotens, W.A. 1990, Resultant clothing insulation: a function of
body movement, posture, wind, clothing fit and ensemble thickness. Ergonomics, 33,
67–84.
Vogt, J.J., Meyer, J.P., Candas, V., Libert, J.P. and Sagot, J.C. 1983, Pumping effect on
thermal insulation of clothing worn by human subjects. Ergonomics, 26, 963–974.

Acknowledgement
Support from the DERA Centre for Human Sciences is acknowledged
© British Crown Copyright 1998/DERA.
Published with the permission of the Controller of Her Britannic Majesty’s Stationery Office.
COMMUNICATING
ERGONOMICS
Commercial Planning and Ergonomics

Jane Dillon

RM Consulting
Royal Mail Technology Centre
Wheatstone Road, Dorcan
SWINDON SN3 4RD

The Human & Environmental Consultancy is a newly formed group within
RM Consulting. It provides three high-level products: ergonomics
consultancy, safety consultancy and environmental consultancy. It was
perceived that the commercial success of the new unit would, to a significant
extent, depend on a clear marketing strategy. This perception was reinforced
by an increasingly complex and competitive marketplace and the speed of
technological change. The commercial planning process would enable the
creation of marketing objectives and the identification of resources and
financial investment needed to achieve them. This would ultimately lead to
greater profitability and growth. The paper describes how the principles of
marketing theory were applied to the activities of a specialised consultancy
unit and discusses the impact of the commercial planning process in
determining the direction and focus of the group’s activities.

Introduction
The Human & Environmental Consultancy is a separate commercial unit within RM
Consulting, one of the main UK consultancies dealing in postal and distribution services. The
group was formed in April 1997 following a major reorganisation of RM Consulting and
brought together three existing groups working in the fields of Safety, Ergonomics and
Environment. The commercial climate in which the group operates is increasingly
competitive and constantly changing. It was recognised that a clear marketing strategy could
play an important role in shaping future success.

There were initially a number of barriers to the marketing process. The main problem was a
lack of basic marketing and planning skills within the group. This was coupled with the
knowledge that the resources required to produce the plan would interfere with short-term
work and current financial performance. Nevertheless, the ever increasing commercial
pressures meant that the status quo was not an option. RM Consulting itself was committed to
strategic market planning and had created a Marketing Consultancy within the organisation.
The production of a commercial plan became a required management activity. The process
was facilitated by the Marketing Consultancy who provided the model for the plan and
important support throughout.

This paper discusses in broad outline the elements that made up the planning process and
the implications for a small consultancy group. It does not address the theory of marketing
planning or the relative merits of different planning techniques.

Purpose and Benefits


The overall purpose of undertaking commercial planning was to identify and create a
sustainable competitive advantage for the group in line with RM Consulting’s business plan.
The process involved a logical sequence of activities leading to the setting of marketing
objectives and the formulation of plans for achieving them. The benefits of commercial
planning were perceived to be increased profitability and improved productivity due to:-

• identification of opportunities and threats


• specification of sustainable competitive advantage
• preparedness to meet change
• improved communication
• better resource allocation
• more market-focused activity

Method
The methods used to produce the commercial plan were:-

• group discussions
• information gathering and analysis
• brainstorming
• discussions with key customers

Elements of the Commercial Plan


The commercial plan consisted of four main elements:- Purpose, Situation Analysis, Strategy
Formulation and Resource Allocation and Monitoring. These are described below.

Purpose
This requires the production of a purpose statement which clearly sets out the role of the
group, its business and its distinctive competence. It is produced mainly through group
discussion. The statement is no more than a page long.

Situation analysis
This section is concerned with understanding the current position of the group in relation to a
number of key areas which are described below.

i) Environmental Review
This reviews elements in the business environment (political, economic, social, legislative,
technological etc.) over which the group has no control and helps to identify opportunities
and threats.

ii) Market Analysis


The objective is to analyse all aspects of the current and potential customer base. This
helps to establish the most important products and customers; whether there is over
dependence on a particular product or customer; which products or projects are coming to the
end of their life and how any lost income might be replaced.

iii) Competitive Analysis


The objective is to identify competitors’ main strengths and weaknesses and assess
current and future impact on the group.

iv) Key Capabilities Analysis


The generic skills (e.g. leadership, project management, report writing etc.) and technical
skills which are required to deliver products are mapped. The output is an understanding of
the capabilities needed within the team and where shortfalls exist. Capabilities which are
under-utilised are highlighted.

v) SWOT Summary
This acts as a summary of the key Strengths, Weaknesses, Opportunities and Threats
(SWOT) which were highlighted during the previous activities. It provides a snapshot view of
the group’s current business position on which to build objectives and action plans.

vi) Creation of Assumptions


The final activity in the Situation analysis is to make assumptions about the factors over
which the group has no control. The main inputs are the results of the environmental review
and the SWOT summary.

Strategy Formulation
This is a key part of the planning process which is designed to examine the group’s relative
competitive position and the attractiveness of different product areas. Products are compared
with the objective of choosing which have the greatest potential for future development. A
number of techniques are available to assist this analysis. The Directional Policy Matrix
(DPM) was selected as the most appropriate. The position of a product on the matrix is
determined by scoring and weighting each product based on Critical Success Factors (CSFs)
and Market Attractiveness Factors (MAFs).

CSFs are the key things which need to be right in order to succeed in a specific market e.g.,
value for money or knowledge of organisation. CSFs are generated by brainstorming all the
success factors that the group feel are important to customers. The five most important are
then identified and given a weighting factor. Each product is then scored for the group and
contrasted with the group’s main competitor for the product.
A MAF is an inherent characteristic of a market. It should not reflect the position of a
product in a market but should represent the market forces e.g., competitiveness, market
growth or threat of new entrants. Each MAF is given a weighting and each product is scored
against each MAF.
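
The scoring and weighting described above amounts to two weighted sums per product; a minimal sketch is given below, with invented factor names, weights and scores standing in for the group's actual CSFs and MAFs.

```python
# A minimal sketch of the Directional Policy Matrix scoring: each product gets
# a weighted CSF score (relative competitive strength) and a weighted MAF score
# (market attractiveness). Factor names, weights and scores are invented
# placeholders, not the group's actual analysis.
csf_weights = {"value for money": 0.3, "knowledge of organisation": 0.3,
               "delivery speed": 0.2, "technical depth": 0.1, "track record": 0.1}
maf_weights = {"market growth": 0.4, "competitiveness": 0.3,
               "threat of new entrants": 0.3}

products = {
    "P1": {"csf": {"value for money": 8, "knowledge of organisation": 9,
                   "delivery speed": 7, "technical depth": 8, "track record": 6},
           "maf": {"market growth": 7, "competitiveness": 6,
                   "threat of new entrants": 5},
           "net_income": 120},   # sets the bubble diameter on the matrix
    "P2": {"csf": {"value for money": 6, "knowledge of organisation": 7,
                   "delivery speed": 8, "technical depth": 6, "track record": 7},
           "maf": {"market growth": 8, "competitiveness": 7,
                   "threat of new entrants": 6},
           "net_income": 80},
}

def weighted_score(scores, weights):
    return sum(scores[name] * w for name, w in weights.items())

for name, p in products.items():
    x = weighted_score(p["csf"], csf_weights)   # competitive position
    y = weighted_score(p["maf"], maf_weights)   # market attractiveness
    print(f"{name}: strength {x:.1f}, attractiveness {y:.1f}, "
          f"bubble size {p['net_income']}")
```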

The results are used to produce a matrix with a bubble representing each product (see
Figure 1). The diameter of each bubble represents the amount of net income and the position
of the bubble shows the relative competitive position and the attractiveness of different
product areas.

Figure 1. Directional Policy Matrix for Group’s Products (P1–P3)


The DPM analysis provides the main basis for creating marketing objectives for the
group. Figure 2 shows suggested market strategies for different positions on the matrix.

Figure 2. Strategies for different positioning on the Directional Policy Matrix

Appropriate marketing objectives for the data in Figure 1 would be to invest and grow
market share for products 1 and 2 whilst maintaining current market share for product 3.

Resource allocation and monitoring


The final stage of the plan is to:

• plan the effective deployment of the group’s capabilities


• identify the people required to achieve the objectives formulated above
• assess the physical resources needed
• create a detailed finance plan
• prepare an action plan which details how the previously identified objectives can be met.

Discussion
In evaluating the success of the commercial planning process, it is necessary to weigh the
costs involved in producing the plan against the benefits which have been achieved as a result.

The creation of the plan involved a considerable amount of work, probably the equivalent of
at least 4 man weeks in staff costs. Additional support from the marketing team was required,
particularly for the DPM analysis. It is unlikely that an effective plan could have been
produced without such professional support. However about 50% of the work involved in
producing the plan would have been done in some form as part of normal management
activity; e.g. finance planning, resource planning, capability analysis.

It is difficult, and probably too soon to quantify the commercial benefits which have resulted
directly from the commercial planning process. Many of the objectives in the plan have
certainly been achieved, e.g. broadening the customer base and increasing market share for
specific products. However, it could be argued that these would have been achieved in any
event. The main benefit has been to increase the level of commercial awareness within the
group: everyone in the team was involved in some part of the planning process. Further
specific benefits included:

• Clear marketing actions were identified


• Support for internal investment was secured
• Training needs to support the plan were identified
• Some activities were dropped
• Communication with top management was facilitated.

Conclusion
On balance, the commercial planning process was successful, although the full benefits are
difficult to quantify at this stage. It forced a rigorous self-examination and yielded some
useful and surprising results. It demonstrated that a specialist consultancy, like any
commercial enterprise, must understand its markets and invest to succeed.

References
Baker, M.J. 1992, Marketing Strategy Management (Macmillan, Basingstoke)
RM Consulting 1997, The Guide to RM Consulting Processes (Internal Document)
HUMAN FACTORS AND DESIGN:
BRIDGING THE COMMUNICATION GAP

Alastair S.Macdonald
Course Leader, Product Design Engineering, Glasgow School of Art,
167 Renfrew Street, Glasgow G3 6RQ, Scotland

Patrick W.Jordan
Senior Human Factors Specialist, Philips Design,
Building W, Damsterdiep 267,
PO Box 225, 9700 AE Groningen, The Netherlands

Poor communication can arise between different professions due to the


limitations in the language each specialism employs. Despite the growth of
expertise in the fields of product design and human factors, the
communication gap between them can still result in under-optimised
products failing to deliver in usability, quality and enjoyment in use. The
ambition of this paper is to examine the emergence of new tools which
facilitate the development of a common language between the professions.
In particular, the paper will highlight communication techniques based on
the use of images drawn from popular culture as a means of expressing
design issues and people’s rational, emotional and hedonic responses to
design.

Limitations of Language
As Wittgenstein wrote in his Tractatus in 1922, ‘Die Grenzen meiner Sprache bedeuten die
Grenzen meiner Welt.’ Translated into English, the sentence reads: ‘The limits of my
language indicate the limits of my world’ and summarises well the difficulty of
communication between different professions. Despite the growth of expertise in the fields of
product design and human factors, many products still betray a poor understanding and
consideration of the end-user through their lack of usability, quality and enjoyment in use.
Consultation with industry-based human factors specialists and designers suggests that a
major reason for this is poor communication between human factors and design
professionals.

The two fields of product design and human factors share areas of concern. Both are user-
centred. Both are concerned with how products, tasks, and environments ‘fit’ people. Design
is a visually-orientated, artistic profession. The designer tends to be a generalist, combining
aesthetic, ergonomic and technological elements to produce an improved or innovative
product. The designer may utilise many forms of specialist knowledge during the design
process to help bring a product ‘into being’.

By comparison, many human factors specialists see their own discipline as a science. The
ergonomist tends to be more of a specialist in one area of his/her field, assisting the designer
with particular expertise or data during the design process, e.g. at the outset with standards
and anthropometric data, and later with user trials.

Whilst designers may prefer visual communication aids, many human factors specialists
may favour communicating via technical reports. This sort of reporting can be seen by
designers as dull, produced in a difficult-to-use format, and more suited to a university
laboratory than a commercial design studio. The language which ergonomics uses has
limited the extent to which it can communicate its own field effectively to others. On the
other hand, visual tools can help orient human factors material to suit designers’ strengths as
well as assisting all those in product development teams facing difficult qualitative value
judgments.

Improving communication: understanding designers


How could ergonomists and designers communicate better with one another? Firstly, it would
be useful to understand the nature of design and how designers operate:

1) Designers are practitioners. Relevant knowledge and expertise is applied as required


during the design process. Good designers usually assemble teams of specialist expertise. A
good product design education provides students with a project-led studio ethos, mirroring
the way that designers work in the ‘real world’ of practice.

2) Design practice is increasingly focused on the consumer. A designer’s remit is usually to


come up with a usable, attractive, safe and commercially successful design within a given
time scale and budget. This is a pressured activity where all knowledge is ultimately
embodied in designs—in drawings, models, and prototypes.

3) In a world dominated by legislation and documents, ‘linguistic’ skills are commonly


regarded as the most important, but in the world of practice, spatial and visual skills
are highly developed in, for example, designers, surgeons and engineers. Designers also tend to be
visually literate (visualate), and respond well to visually-orientated learning and reference
material.

4) Design is seldom a precise science. Design processes often contain so-called ‘fuzzy’
problems with no readily identifiable single correct solution. As a result, designers tend to be
speculative, to have the ability to progress an idea without knowing all the facts—using a type
of ‘fuzzy-logic’, to develop and preserve the ‘sense’ of a product, and they become adept at
making value judgments.

Improving communication: developing new tools


This view of designers suggests the need for ‘bridging tools’ to facilitate a common language
between the professions. Because of the timescales within which designers have to work,
they have developed a number of quick, visual, ‘desk-top’ methods for sharing their ideas.
Examples of two such tools are discussed here, one used in the educational and the other in
the professional context.

Focus Boards
At Glasgow School of Art, the Product Design Engineering course is multi-disciplinary in
nature and involves engineering, product design, and human factors specialists teaching
together in the design studio with students who will emerge as fully qualified engineers.

As a profession, engineers have traditionally had a poor reputation for visual presentation and
communication skills, but the Glasgow model of educating them has demonstrated that
engineering students can easily acquire and develop an attractive ‘skills set’ including visual

communication skills—normally the domain of the product designer. This set of skills is
regarded as a valuable asset by industry. Focus Boards (FBs) comprise one of a number of
visual methods used in studio to develop product concepts and details.

Using images generated from a number of sources—magazines, catalogues, hand-drawn


sketches, diagrams, and photographs together with key words, the FBs help develop a sense
of the desired qualities in the end product, to locate and focus on the market in terms of
customer profile and lifestyle, and to develop a visual reminder and understanding of the
environment and context of use of the product (Table 1).

One area of the FBs introduces the idea of ‘parallel products’ which discuss qualities,
features and details in terms of existing and familiar products already displaying some of the
desired qualities. Through careful editing during their genesis—rather like working on a
number of drafts in a script—students learn to give order and coherence to the boards and to
discriminate quite clearly the particular qualities for carefully targeted end-users. Unlike a
typescript, concept qualities are illustrated by tangible and accessible-to-all examples.

These FBs ensure that all members of the teaching/learning team—designer, ergonomist,
engineer and student alike, are drawing from clear focussed examples. This ensures that the
engineering is human-centred and that a shared vision of the product is developed.

Table 1. A typical Focus Board schematic

Figure 1. An example of a student Focus Board in use

Figure 1 shows a student visualising the requirements for an on-board heater for an
emergency life raft through a Focus Board.

Visual Communication Boards


The ‘Humanware’ function at the Groningen studio of Philips Design is responsible for the
input of a number of disciplines into the creation process for a huge variety of household
products from kettles to sunbeds. These disciplines include social science disciplines, such as
sociology, anthropology and psychology; specialist design disciplines, such as interaction
design, trend analysis and multimedia design; and human factors. As part of a design
department, Humanware works hand in hand with product designers. However, it is also a
bridging competence, between design and other disciplines involved in the product creation
process, such as market research, marketing, engineering and product management.

“…it’s a bit like Shakespeare really: it all sounds very good, but it doesn’t actually mean
anything.” This quote from the P.G.Wodehouse character Bertie Wooster might be seen as a
reflection on the way that different disciplines view verbal and written communications from
others. The multidisciplinary nature of Humanware itself, and the involvement of the function
with other disciplines, presents a challenge in terms of communication. Although written
reports have a place in this communication, there are disadvantages associated with them in
this context of use. Different disciplines use different jargon in communication and concepts
that are understood by some disciplines are not meaningful to others. For example, talk of
concepts such as ‘consistency’, ‘compatibility’ and ‘visual clarity’, whilst meaningful to
human factors specialists, may mean very little to the other disciplines in the product creation
process. It is equally unlikely that practitioners of these disciplines will want to slog through
pages of text in order to be educated as to the meaning of such concepts!

Because of these potential communication difficulties, and because design is a visually


oriented profession, the majority of Humanware’s communication is conducted visually. The
main tool used for this is the Visual Communication Board (VCB). VCBs employ images to
communicate concepts and the relationship between concepts and the symbols, product
semantics and form language by which those concepts are embodied. Because the images are
drawn from popular culture and are readily recognised by all, they facilitate an effective
discussion by all involved in the product creation process.

Figure 2. Exploring a ‘masculine’ image
Figure 3. A typical ‘Four Pleasure’ board

A product briefing required a product to project a ‘masculine’ image. But what does this
really mean? The VCB (Figure 2) portrays a number of images of masculinity, from macho,
through traditional, to SNAG (Sensitive New Age Guy).

Figure 3 shows images associated with each of the four pleasures: physio, socio, psycho and
ideo (see Jordan and Macdonald 1998).

Figure 4. Human Diversity


Figure 4 shows images of human diversity. Who are our target group?

Conclusion
As increasing numbers of different specialists join product development teams, e.g.
anthropologists, linguists, sociologists, designers and human factors specialists, it will be crucial
for new ‘bridging tools’ between these disciplines to be developed to allow a shared vision.
Pirkl and Babic (1988) have already produced visual demographic charts which translate
physiological data into product design specifications, a great facility for bridging two disciplines.

If human factors is to get its message across to the design community, it is essential that it
embraces a more visually-oriented approach to communication. This is particularly relevant
in the light of recent developments in human factors, for instance in the emerging desire to
articulate issues such as pleasure in products which lie outside the human factors specialists’
traditional sphere of interest.

One way to achieve this would be to ensure ergonomists had more training in envisioning
information and qualitative value judgement during their education.

References
Jordan, P.W., and Macdonald, A.S. 1998, Pleasure and product semantics. In S.A. Robertson
(ed.) Contemporary Ergonomics 1998, (Taylor and Francis, London)
Macdonald, A.S. 1997, Developing a qualitative sense. In N.Stanton (ed.) Human factors in
consumer products, (Taylor and Francis, London), 175–191
Pirkl, J.J. and Babic, A.L. 1988, Guidelines and strategies for designing transgenerational
products. (Copley Publishing Group, Acton, Massachusetts)
Wittgenstein, L. 1922, Tractatus logico-philosophicus, Proposition 5.6. (Kegan Paul, London)
GUIDELINES FOR ADDRESSING ERGONOMICS IN
DEVELOPMENT AID

T Jafry and D H O’Neill

International Development Group,


Silsoe Research Institute, Wrest Park, Silsoe
BEDFORD MK45 4HS

Subsistence agriculture in developing countries involves work which is


physically demanding and time consuming. Much of the work is done
manually because there are no suitable tools and implements available,
people cannot afford to purchase tools or they have no access to tools and
implements. Generally, women have longer working days than men because,
in addition to agricultural work, they are responsible for domestic work and
looking after children. The British Department for International
Development (DFID) has recognised that the application of ergonomics can help to
reduce the drudgery and fatigue of agricultural and domestic work of poor
people, particularly women. An ergonomics guide has been produced and is
to be implemented as a matter of policy within DFID by the end of 1998.
The ergonomics guide is described in this paper.

Introduction
The most valuable resource of any country is its people, particularly so in developing
countries where other resources may be more limited. Furthermore, very poor people have
little more than their own resourcefulness (physiological and intellectual) on which to depend
for gaining their livelihoods. The application of ergonomics will enable poor people to make
effective use of their abilities and optimise their performance. In terms of development, the
benefits of a system which takes into account people and the way they work include:

• food security and enhanced economic status of households through increased productive
capacity
• reduced drudgery and fatigue from work
• more time available, especially for women
• improved health and quality of life
• fewer accidents and injuries.

The White Paper on International Development


Through sustainable development, DFID’s (Department for International Development) aim
is to achieve a reduction by one half in the proportion of people living in extreme poverty by
2015. The White Paper on International Development (DFID 1997) states that sustainable
development requires the management and maintenance of four sorts of “capital” which
support human well-being:

• created capital: including machinery and equipment


• natural capital: the environment and natural resources
• human capital: human skills and capacity
• social capital: social relations and institutions

A policy which ensures that the human resources in a country are properly utilised, through
bridging the relationship between these “capitals” (especially human and created capital), can
contribute directly to DFID’s aim. Introducing ergonomics (people-technology interaction) as a
matter of policy within DFID will help to achieve this and also complement other disciplines
such as economics, environmental issues and social development. In order to accommodate the
human factors, guidelines were developed as a practical tool for aid professionals to ensure that
ergonomics issues are adequately addressed in future aid programmes and projects.

Methods
Development of the guidelines
The guidelines were developed from three sources of information. Firstly, listening to the
needs of the customer, in this case DFID. Their requirements were to make the document
short, user-friendly and written in a positive tone. Secondly, the authors’ own knowledge and
experience of identifying and solving ergonomics problems in developing countries made a
major contribution. Thirdly, information was taken from reviewing the literature generated by
other authors covering ergonomics concerns in developing countries.

Testing the guidelines


A draft document was produced in January 1997 and circulated to DFID aid professionals for
their comments. Their views on how to improve the draft document were received and
incorporated into the guidelines, which have now been finalised.

Results
The guidelines document is split into 10 sections. The whole document cannot be described
in this short paper but the most important section, which contains the ergonomics checklist
for project screening, is given in Table 1.

Ergonomics Checklist
The questions in the checklist are underpinned by a single more fundamental question: how
will the project affect people? The checklist is split into five ergonomics concerns: the
individual; tools and equipment; working conditions; technology transfer; and accidents and
injuries. The questions require either a YES or NO answer.
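
By way of illustration only, the sketch below shows one way such a YES/NO checklist could be represented and applied to flag concerns for a project; the question wording and the outcome rule are hypothetical and are not taken from the DFID checklist reproduced in Table 1.

# Illustrative sketch of a YES/NO screening checklist grouped by the five
# ergonomics concerns. The question wording is hypothetical, not the wording
# of the DFID checklist.

checklist = {
    "the individual": [
        "Will the work demands exceed people's physical capacity?"],
    "tools and equipment": [
        "Are the tools poorly matched to the users' body sizes and strength?"],
    "working conditions": [
        "Will people be working in extreme heat, dust or poor light?"],
    "technology transfer": [
        "Is the technology unfamiliar to the local users?"],
    "accidents and injuries": [
        "Does the work expose people to obvious injury risks?"],
}

def screen_project(answers):
    """Return the concerns for which any checklist question was answered YES."""
    return [concern for concern, questions in checklist.items()
            if any(answers.get(q, False) for q in questions)]

answers = {"Are the tools poorly matched to the users' body sizes and strength?": True}
print(screen_project(answers))   # ['tools and equipment']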

Table 1. Ergonomics Checklist

The remaining sections of the document contain appropriate information for aid
professionals to:

• learn what ergonomics is and why ergonomics inputs are beneficial


• be shown how to use a checklist to screen projects for potential ergonomics problems
• be informed of the key ergonomics concerns in developing countries
• be provided with case studies which explain how the checklist is used
• be guided on where to get further advice and information.

Discussion
Guidelines have been developed as a practical tool to help aid professionals identify
ergonomics problems or potential problems in their programmes and projects. Use of the
guidelines has a number of benefits. In summary, these will:

• help DFID in their quest to eliminate poverty, through bridging the gaps between people
and technology;
• contribute to improving the design and implementation of DFID’s projects, particularly
regarding people-technology interaction, and thereby increasing their effectiveness;

• facilitate the development, adoption or adaptation of tools and equipment to be better


matched to the physical and mental capabilities of local (indigenous) people;
• provide benefits to the millions of people who work in the cottage industries, through
simple low-cost improvements to the organisation of work, workplace layouts and work
schedules.

A programme of follow-up work is currently being conducted to integrate ergonomics into


DFID policy, research and bi-lateral programmes. The guidelines document will play a key
role in achieving this. The work will also continue to raise the profile of ergonomics in
development aid.

Acknowledgements
The authors would like to acknowledge the British Department for International
Development for funding this work.

References
DFID. 1997, Eliminating world poverty—a challenge for the 21st century, White Paper on
International Development, (The Stationery Office Limited, London)
DETERMINING AND EVALUATING ERGONOMIC
TRAINING NEEDS FOR DESIGN ENGINEERS

J Ponsonby, RJ Graves

Department of Environmental & Occupational Medicine


University Medical School, University of Aberdeen
Foresterhill, Aberdeen, AB25 2ZD

There have been many attempts to design and run different levels of
ergonomics training for engineers and other professionals. The current study
aimed to identify engineering ergonomic training needs in order to develop
and evaluate a support aid. The first stage involved surveying a sample of
design engineers to identify apparent gaps in their knowledge. The
information was used to design an ergonomics support aid for engineers and
a training package. The support aid was structured to help the designer to
think about the whole process including analysis of the tasks, sources of
information for the worker, and the implications for physical aspects of the
task. A pilot study with physiotherapists showed that the aid improved their
performance. The second stage involves an experimental study with two
samples of engineers and is ongoing.

Introduction
Engineers are fundamental to the design process and can be highly influential in relation to
ergonomics issues in various levels of system design. It has been recognised for some time
that improving engineers’ knowledge of ergonomics can improve human factors aspects of
work design (Graves, 1992). There have been many attempts to design and run different
levels of ergonomics training for engineers and other professionals (see for example, Graves
et al, 1996). Little experimental evaluation of the degree of the effectiveness of different
types and levels of training, however, appears to have been carried out.
Classic approaches to ergonomics training seem to emulate the skills of the ergonomist in
designing the training material. It is not clear whether engineers and/or designers have the
same ergonomic training needs or whether different types of engineer require the same type
or degree of knowledge. Woodcock and Galer Flyte (1997) surveyed automotive designers in
relation to product design and found that although ergonomics was considered throughout
design, it was not considered rigorously enough. In addition, there are questions about the
levels of existing knowledge and training of designers involved in manufacturing processes.

Graves et al (op cit.), in training around 160 manufacturing “design” engineers, were faced with
varying levels of engineering knowledge and capability. At the end of the two day courses,
small teams of participants undertook design exercises based on their work environments to
confirm their ability to take account of basic ergonomics principles in their design solutions.
Although the majority appeared to be able to use the task and risk analysis approach and the
workspace tools, there was a surprising variation in basic engineering knowledge.
The current study aimed to identify engineering ergonomic training needs in order to
develop and evaluate a training package designed to satisfy these needs (Ponsonby, 1998).
This paper describes the development of the support aid and the preliminary results from a
pilot study with physiotherapists to test components of the aid.

Approach

Overview
The first stage involved surveying a sample of design engineers from manufacturing and
process companies and identifying apparent gaps in their knowledge.
In the first part of the second stage, the information was used to help design a prototype
ergonomics support aid for engineers and an associated training package. A pilot study of the
prototype support aid was then undertaken to assess the effectiveness of the material.
The second stage involved designing an experimental study with two samples of
engineers. This is ongoing and is designed to determine the levels of ergonomics knowledge
used by both samples of engineers in an applied synthetic task. One sample will be trained
using the package and then both retested using a synthetic task to measure differences in
performance.

Stage 1
The two companies involved were a cardboard manufacturer (Company A) and a biscuit
manufacturer (Company B). Informal interviews were undertaken with a sample of engineers
to determine their level of knowledge of ergonomics, where they gained that knowledge and
how they used it. A questionnaire was developed from the results of these discussions.
A pilot study of the questionnaire was undertaken on each site. Following this, several of
the questions were altered, and the final questionnaire was then sent out to a wider audience
and other sites. The questionnaire was used to identify gaps in the knowledge and thinking of
the design engineers. This information was used in the development of the support aid. The
main survey was undertaken by post to engineers in both companies throughout the United
Kingdom and abroad (fourteen design engineers in Company A and twenty in Company B).
Several informal on-site visits were used to help determine the types of ergonomic risk to
which operators were being exposed. A specific area of production was chosen to be
investigated in more detail as an example of where gaps in knowledge had been identified to
help in the development of the support aid. Figure 1 illustrates an overview of the process
with the numbers referring to specific sections. Each section contained worksheets designed
to support the user’s decision making. The support aid was structured to help the designer
think about the worker in the whole process and the role of the worker. The next steps covered

analysis of the tasks, sources of information for the worker, and the implications for physical
aspects of the task…
Figure 1 Overview of ergonomics support aid sections

In addition, the extent and costs of musculo-skeletal problems were assessed by
examining sickness absence data, insurance claims and turnover. Information was collected
on the different tools and check lists available at the moment to assist both design engineers
and anyone else involved in looking at ergonomic problems, both existing and new. The

questionnaires were analysed to identify ergonomic gaps in knowledge and these results in
conjunction with the task analysis and current support tools were all drawn together to create
a support aid to be used by design engineers to improve their application of ergonomics to
both new and existing projects.
A pilot study of the prototype support aid was undertaken by four volunteer physiotherapy
staff prior to the main exercise to identify any problems. They had the same pre-assessment
exercise intended for the engineers. The four volunteers were given an instruction sheet and
asked to watch a video recording of a sample of work practices and asked to identify any
ergonomic issues, concentrating on possible musculoskeletal problems. A separate
information sheet provided additional information on the work area. The results from this
were scored against pre-set criteria.
This was followed by a three hour training session which involved instruction and practice
in using the prototype support aid. The training was a condensed version of that planned for
the engineers because of time constraints on these volunteers. A post-assessment similar to
the pre-assessment was undertaken. This involved using the support aid. This was scored
using the pre-set criteria and a questionnaire was completed by all the participants at the end
of the exercise. Modifications were made to the support aid following the post-assessment
exercise.

Stage 2
The project plan was to carry out a study of the support aid in both industrial environments but
company B had difficulty in committing time to the project. The revised plan involved having
two groups of design engineers from company A, with both groups having the same pre-course
assessment to identify two equally matched groups for the experimental trials. One would use
the new support aid while completing the same video based exercise. The two groups would be
compared to see if using the support aid improved the ergonomic performance of the engineers.
The pre training assessment is on-going due to delays at the company.

Preliminary Results and Discussion

Stage 1
Eighteen out of 34 questionnaires (53%) were returned from the design engineers (12 from
Company A and 6 from B). Four respondents had a degree, two a Diploma, five an HNC in
engineering and seven “other”. Ten said they had limited ergonomics knowledge, seven
moderate and one extensive knowledge. The questionnaire responses of the respondents from
both companies to technical ergonomic questions revealed that the overall number of correct
answers tended to be low. For example, for all but one question on manual handling and one
on the upper limb, half or fewer of the respondents knew the correct answers. This showed
that the designers did not appear to have adequate ergonomics knowledge.
Table 1 shows the results from the pilot trials of the support aid with the physiotherapists.
All of the group improved upon their initial score after training, indicating that they considered
more of the pre-set ergonomic criteria when using the support aid. However, the maximum score
overall, even after using the support aid, was just above half the total attainable (25 out of 42;
see Table 1), indicating that they were still missing, or not recording, a substantial proportion
of the relevant points.

Table 1 Percentage improvement between pre and post test scores from
pilot trials of the support aid with the physiotherapists

Table 2 shows the results from the pilot trials of the support aid with the physiotherapists
in relation to the detailed pre-set criteria categories. Clearly there were a number of areas
where further support could be required, e.g. task analysis, sources of information and of
physical demand, and additional information.

Table 2 Percentage improvement between pre and post test scores in relation to pre-set criteria

From the pilot study, however, it can be concluded that the performance of the physiotherapists
improved when using the ergonomic support aid. In addition, the video exercise seemed to be
suitable for testing the criteria both pre and post training. A number of sections within the tool
were improved to take account
of post pilot study feedback. It now remains to be seen whether a similar improvement can be
obtained from the engineers in the on-going next stage of the study.

References
Graves, R.J. 1992, Using ergonomics in engineering design to improve health and safety. In
J.D.G.Hammer (ed.) European Year of Safety and Hygiene at Work Special Edition
Safety Science, 15, (Elsevier, Amsterdam) 327–349
Graves, R.J., Sinclair, D., Innes, I., Davies, G., Bull, G. and Burnand, M. 1996, Applying
ergonomics research and consultancy in a manufacturing design process to reduce
musculoskeletal risk. Annual Scientific Meeting of the Society of Occupational
Medicine, Society of Occupational Medicine: Birmingham
Ponsonby, J. 1998, An evaluation of the ergonomic training needs for design engineers, MSc
Ergonomics Project Thesis, Department of Environmental and Occupational
Medicine, University of Aberdeen: Aberdeen
Woodcock, A. and Galer Flyte, M.D. 1997, ADECT—Automotive Designers Ergonomics
Clarification Toolset. In S.A.Robertson (ed.) Contemporary Ergonomics 1997, (Taylor
and Francis, London), 123–128
ERGONOMIC IDEALS vs. GENUINE CONSTRAINTS

Duncan Robertson, Simon Layton & Jayne Elder

Human Engineering Limited


Shore House, 68 Westbury Hill,
Westbury-on-Trym,
Bristol, BS9 3AA.

Ergonomics is by its very nature an applied discipline. University courses


prepare students well in terms of technical knowledge, but often give less
guidance on the constraints imposed in a commercial environment. This
paper presents case examples illustrating the experience of incorporating
ergonomics into the design of two train cabs.

Background
This paper describes the experience of incorporating ergonomics into the design of two train
cabs, whilst taking into account the genuine constraints associated with such a project. The
cabs were developed by Human Engineering Limited for a UK train manufacturer.
Cab ‘A’ was for use in driver/guard operation, and cab ‘B’ was for use in driver only
operation (DOO). Each type of operation has a number of unique requirements with respect
to the cab design. The two cabs also differed in their layout. Cab ‘A’ was a ‘full-width’ cab,
with the driving position on the left, and the non-driver on the right. Cab ‘B’ however, had a
central gangway to allow two units to be coupled together. Using these gangways passengers
can walk the length of the train. This design resulted in the driver’s console being
approximately one third smaller than in the full-width cab—with the same amount of
equipment to be located in both cabs.

Approach
The approach that was taken can be broken down into three stages. Initially (during Stage 1) a
task analysis of the driver’s role was performed for each of the existing cabs.
The second stage was to model the cab environment, using the human 3D modelling
package MQPro. The aim of this stage was to assess elements of the design, such as access/
egress, clearance, driver posture, reach envelopes and sightlines. For example, Figure 1
shows the reach envelope for a 5th percentile (%ile) UK male, within cab ‘A’.

Figure 1—5th %ile UK male manikin within cab ‘A’

During Stage 3 a number of elements of the design work were carried out in parallel. This
included the specification and design of controls, audible warnings, displays, and the panel
layouts. The proposed traction-brake controller was also assessed.

Constraints
Due to the nature of the project there were a number of constraints imposed upon the cab
designs. These constraints can be divided into four categories: cost, engineering, panel/
control size and time constraints.
Cost constraints included, for example, limitations in the new technology incorporated
into the cabs; a number of the controls and displays, such as the speedometer, brake pressure
gauges, etc., were required by the manufacturer to be tried and tested, off-the-shelf items. The
structural design of the front end of the train and cab also imposed engineering constraints on
the location of controls. For example, the position of the connecting corridor in cab ‘B’
limited the positioning of controls on the driver’s right side wall. In addition, the size of
certain controls and console panels restricted the control location and orientation.
Consequently this had a knock-on effect on the grouping of controls in that there was
sometimes insufficient space available to achieve the ideal functional grouping. Finally, the
entire modular train development, from concept to production, was taking place over a period
that was half the industry standard at that time.

Design Process
Figure 2 shows the design process which was followed throughout the project for items within
the cabs with examples for each stage. Initially design recommendations were made by the
ergonomists taking into account a set of constraints (for example, the switches and buttons must
be from one of two suppliers). The preferred solutions were sometimes rejected by the client (or
a supplier) when additional constraints restricted the recommendation’s viability.

Figure 2—Design process (on left), and a worked example

When recommendations were rejected, a number of alternative solutions would then be


put forward. These not only conformed to the additional constraints, but also applied as many
ergonomic principles as possible. For example, if it was not possible to locate a control in the
ergonomically ideal position due to insurmountable structural constraints, the most suitable
alternative would be identified and principles of coding, size, etc., would be adhered to.
In the instances when an item of equipment failed to meet several ergonomic principles, a
formal statement would be submitted to the client outlining the potential problems for the
driver. The client would, on the basis of that evidence, then decide the course of action to take
to enable as many ergonomic principles to be applied as possible. The following paragraphs
outline two case examples.

Case Examples

Communications Unit
The communications unit initially proposed for the cab (which was to be supplied by a
company within the manufacturer’s group) did not comply with a number of ergonomic
principles (see Figure 3A). These included the grouping, labelling, spacing and colour coding
of controls.
Therefore, whilst acknowledging constraints such as overall size and general functionality
of the unit and minimising the number of major changes, the ergonomists then proposed their
preferred solution (see Figure 3B). The controls on the panel were spaced and labelled
appropriately, lines were used to group related controls, and colour was used to identify
important controls (e.g. ‘acknowledge alarm’ and ‘emergency egress’).

Figure 3—Communications Unit: A) supplier’s original design, B) preferred solution, C) compromise design

Due to space restrictions, and constraints in the manufacturing process used, the space
between the controls could not be increased. Therefore, a compromise solution was put
forward to the client which took into consideration this constraint (see Figure 3C). This
solution still incorporated the grouping and labelling principles used in the preferred solution.

Traction-brake Controller (TBC) Track Layout


The traction-brake controller is a hand control used by the driver to initiate either traction and
power or braking. The driver either pushes the controller to start braking or pulls it to engage
power. The track is divided into a number of steps. The customer for Cab ‘B’ requested a
continuous movement from ‘low brake’ to ‘full brake’, four defined steps for power settings
and a ‘Reserve Power’ step which would only be used in emergencies.
Figure 4A shows the preferred solution which was recommended to the client. The
triangle between the ‘low brake’ and the ‘full brake’ gives the impression of a continuous
movement. To enter the ‘Reserve Power’ notch the driver must move the TBC to the left and
break a witness seal. The seal would deter the driver from entering the notch in a
non-emergency situation. The braking labelling was coloured red, and the power labelling
green.
Figure 4B shows the final TBC track layout. The extra notch to enter ‘Reserve Power’
could not be achieved due to a combination of space, engineering and cost restrictions.
Complete words could not be used in the labelling because of space constraints. Therefore

alternative labelling had to be used e.g. ‘H’ for ‘full brake’, ‘L’ for ‘low brake’ and ‘E’ for
‘emergency brake’.
However, red was still used to differentiate between power and braking, and the triangle
(also coloured red) was still used to identify the continuous braking. Green could not be used
to colour the power steps because of cost constraints.

Figure 4—TBC track layout: A) the ergonomic ‘ideal’, B) the final track layout.

Conclusion
Ergonomists must deal with design constraints, be they cost, engineering, political etc.,
regularly during their working lives. Unless the constraint ‘goal posts’ can be moved through
dialogue with the client, compromises must be made.
A human factors/ergonomics degree provides a student with the technical knowledge
needed to produce ergonomically ideal solutions to problems. However, in most cases
students are not given guidance on how to make compromises to their designs. Such guidance
would make for a smoother transition between the academic and commercial sectors for
newly graduated students whilst maintaining ergonomic integrity in design and a harmonious
relationship with the client.
GENERAL
ERGONOMICS
ANOTHER LOOK AT HICK-HYMAN’S REACTION
TIME LAW

Tarald O.Kvålseth

Department of Mechanical Engineering


University of Minnesota
Minneapolis, MN 55455, USA

The classic Hick-Hyman’s law of choice reaction time is re-examined in two


respects. First, it is re-emphasized that this lawlike relationship holds for the
total set of stimulus-response pairs, but is not capable of accounting for the
reaction times of individual stimuli. Second, while Hick-Hyman’s law is
based on an abstract information-theory measure that lacks any meaningful
interpretations, an alternative model is considered based on a predictor
variable that has intuitively appealing interpretations in probabilistic terms.
Some numerical data are used to compare the two models.

Introduction
Consider the general reaction-time paradigm involving n potential stimuli with a probability
distribution P=(p1,…, pn) and with each stimulus requiring a specific response. That is,
Stimulus i occurs with probability pi on any given trial, with i=1,…,n and p1+…+pn=1, and
the subject responds as quickly as possible. In the case of error-free performance, or nearly
so, the classic relationship known as Hick-Hyman’s law, after Hick (1952) and Hyman
(1953), is given by

RT = a + bI(P),   I(P) = -(p1 log2 p1 + … + pn log2 pn)   (1)

where RT is the overall mean reaction time for all stimuli, a and b are parameters to be
empirically determined, and I(P) is the mean information content, or uncertainty, of one
stimulus event as measured by the information measure due to Shannon (1948). When the
stimuli are all equally likely (i.e., p1=…=pn=1/n), Equation (1) reduces to

RT = a + b log2 n   (2)
Hick-hyman’s reaction time law 573

In the case of a single stimulus (n=1), it follows from both equations that RT=a so that a is
the so-called simple reaction time.
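
By way of illustration, the following sketch computes I(P) for an arbitrary stimulus distribution and the mean reaction time predicted by Equations (1) and (2); the parameter values a and b are purely hypothetical, not estimates from any of the data discussed in this paper.

# Shannon information I(P) of a stimulus set and the overall mean reaction
# time predicted by Hick-Hyman's law. The parameter values a and b below are
# illustrative assumptions only.
from math import log2

def shannon_info(probs):
    """I(P) = -sum of p_i * log2(p_i), in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

a, b = 180.0, 150.0                 # msec and msec/bit (illustrative)

P = [13/16, 1/16, 1/16, 1/16]       # unequal stimulus probabilities
print(shannon_info(P))              # about 0.99 bits
print(a + b * shannon_info(P))      # predicted overall mean RT, Equation (1)

n = 4                               # equally likely stimuli
print(a + b * log2(n))              # Equation (2): RT = a + b log2 n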

While the general validity of this law as expressed by Equations (1)–(2) has been established
on the basis of numerous empirical studies, it does not hold for the reaction time to individual
stimuli. In the present paper, we shall briefly re-examine this limitation of Hick-Hyman’s
law, which often appears to be either neglected or misunderstood in the published literature.
Also, we shall consider an alternative formulation for the overall mean reaction time that is
based on the expected probability or repetition probability as a meaningful independent
variable. Some empirical results will also be presented.

RT for Individual Stimuli


Even though the formulation in Equation (1) appears to be appropriate as an average
description for the total stimulus set, it does not appear to apply to the reaction time for
individual stimuli. According to Hick-Hyman’s law, the reaction time RTi for Stimulus i with
probability pi should be a linear function of the information content (uncertainty) I(pi)=-log2pi
of the ith stimulus event with the same parameter values as those of a and b in Equation (1)
for the same experimental data set. However, as was first pointed out by Hyman (1953), this
requirement does not appear to be met, stating that “we would expect the mean reaction time
to each of the components within a condition to fall on the regression line which was fitted to
the over-all means of the conditions…. But such was not the case (p. 194).” As a specific
example, Hyman (1953) mentioned the case of one subject and observed reaction times
306 and 585 msec for two stimuli with respective information content of I(13/16)=.30 and
I(1/16)=4.00 bits whereas the corresponding reaction times predicted by the fitted regression
model for all stimuli were 258 and 824 msec, respectively. Without providing the specific
data, Hyman (1953) stated that such findings were typical for all subjects and conditions, i.e.,
low-information stimuli have larger reaction times and high-information stimuli have smaller
reaction times than those predicted by the overall model in Equation (1) involving all stimuli.
Such inconsistent findings have raised serious concern about the general validity of Hick-
Hyman’s law (see, e.g., Laming, 1968, p. 10; Luce, 1986, pp. 392, 405). While this limitation
of the law is appreciated by some authors, it is either ignored or misunderstood by others.
Some authors have specifically and incorrectly stated that the reaction times for individual
stimuli fall along the linear function in Equation (1) for the average reaction times of all
stimuli (e.g., Wickens, 1992, p. 318).
If the reaction time RTi for a particular stimulus is considered to be a linear function of its
information content I(pi)=-log2pi , then the inconsistency referred to above would imply that
this linear function has parameter values that differ from those of the overall mean
relationship in Equation (1). That is, if Equation (1) and the equation

RTi=a+bI(pi) (3)

are both fitted to the same set of experimental data, we would expect the parameter estimates
to be different for Equations (1) and (3), whereas they should be the same according to Hick-
Hyman’s law.
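
The comparison just described amounts to fitting two straight lines to data from the same experiment: condition means against I(P) for Equation (1), and individual-stimulus means against I(pi) for Equation (3). A minimal sketch of that procedure follows; the reaction times and the second condition are fabricated purely to show the mechanics, and are not the Fitts et al. or Theios data reanalysed next.

# Fitting Equation (1) (condition means vs I(P)) and Equation (3)
# (individual-stimulus RTs vs I(pi)) to the same data and comparing the
# parameter estimates. All reaction times below are fabricated.
import numpy as np

# One condition: stimulus probabilities and (made-up) mean RTs per stimulus
p_i  = np.array([13/16, 1/16, 1/16, 1/16])
RT_i = np.array([310.0, 560.0, 575.0, 590.0])        # msec

I_pi = -np.log2(p_i)                  # information of individual stimuli
I_P  = float(np.sum(p_i * I_pi))      # mean information of the condition
RT   = float(np.sum(p_i * RT_i))      # overall mean RT of the condition

# Equation (3): regression over the individual stimuli of one condition
b3, a3 = np.polyfit(I_pi, RT_i, 1)

# Equation (1): regression over conditions (a second, equally fabricated one)
cond_I  = [I_P, 2.0]                  # I(P) per condition, bits
cond_RT = [RT, 480.0]                 # overall mean RT per condition, msec
b1, a1 = np.polyfit(cond_I, cond_RT, 1)

print("Equation (3): a = %.0f msec, b = %.0f msec/bit" % (a3, b3))
print("Equation (1): a = %.0f msec, b = %.0f msec/bit" % (a1, b1))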
In order to test the proposition of differing parameter values for the two fitted models, we
shall reanalyze some experimental data by Fitts et al. (1963; Fig. 2, Session 4) as also given

by Fitts and Posner (1967, p. 102) and involving n=9 stimulus-response pairs as well as data
by Theios (1975) for n=2. When Equations (1) and (3) are fitted to Fitts et al’s data by means
of linear regression analysis, the following results are obtained:

(4)

with the values of the coefficient of determination R2=.98 for the RT model and R2 =.92 for
the RTi model. Similarly, based on the data from Theios (1975),

(5)

with the respective R2 values of .92 and .90.


It is apparent from these results that the parameter values for Equations (1) and (3) may
indeed differ substantially. This is especially true for the results in Equation (5) where the
values of the slope parameter b differ by a factor of about three between the two linear models
and the estimated values of the simple reaction time (a) differ by 63 msec. When looking at
scatter plots for these data (see, e.g., Fitts and Posner, 1967, p. 103), it would also appear that
Equation (3) is simply not an appropriate formulation, with clear departures from linearity for
extreme values of I(pi). Thus, while Hick-Hyman’s law appears to be an appropriate
relationship for the total stimulus set, it fails to account for the reaction time to individual
members of the stimulus set. To quote Hyman (1953), “If, however, we are interested in the
behavior of the components making up the conditions, we must find different laws and
equations (p. 194).” However, no such well-fitting model for individual stimuli appears yet to
have been presented in the published literature.

An Expected Probability Model


Hick-Hyman’s law in Equations (1)–(3) is based on the proposition that the choice reaction
time is a linearly increasing function of the stimulus information content (uncertainty), using
the information measures developed by Shannon (1948). Although the measures by Shannon
are the most commonly used ones in various fields of study, entire families of alternative
information measures have been proposed (see, e.g., Kapur, 1994). However, while
Shannon’s and other information measures have a number of desirable mathematical
properties, they have an important limitation: they lack meaningful interpretations. The
numerical values of such measures are abstract numbers lacking any meaningful or
operational interpretations in some probabilistic sense.
As a potential alternative and meaningful predictor of the overall mean reaction time, we
shall consider the self-weighted arithmetic mean probability M(P) = p1² + … + pn², i.e., the weighted
arithmetic mean of the probabilities p1,…,pn, weighting each probability by itself. The M(P) is
also called the expected probability (e.g., Fry, 1965, p. 210) since it is the statistical expectation
of a random variable that takes on the potential values pi with probabilities pi for i=1,…,n. It
may also be called the repetition probability since M(P) can be interpreted as the probability
that the same stimulus occurs twice during two independent experimental trials (replications).
Furthermore, M(P) may be interpreted in a predictive sense as the probability of correctly
predicting which stimulus will occur on any given trial when the various stimuli are a priori
predicted to occur with given probabilities p1,…,pn (by, say, tossing an n-sided die whose sides
appear with probabilities p1,…,pn). Clearly, pi² is the probability of a correct prediction for

Stimulus i (i.e., the probability of predicting that Stimulus i will occur when it does actually
occur) so that M(P), as the sum of all the pi² (i=1,…,n), is the probability of a correct prediction
whichever stimulus occurs on any given trial.
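
A minimal sketch of M(P), together with a simulation check of its repetition-probability interpretation, is given below; the distribution used is illustrative only.

# The expected (self-weighted mean) probability M(P) = p1^2 + ... + pn^2, and
# a Monte Carlo check of its interpretation as the probability that the same
# stimulus occurs on two independent trials. The distribution is illustrative.
import random
random.seed(0)

def expected_probability(probs):
    return sum(p * p for p in probs)

P = [13/16, 1/16, 1/16, 1/16]
print(expected_probability(P))        # (13/16)^2 + 3*(1/16)^2, about 0.672

trials = 100_000
repeats = sum(random.choices(range(4), weights=P)[0] ==
              random.choices(range(4), weights=P)[0]
              for _ in range(trials))
print(repeats / trials)               # close to 0.672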
It seems reasonable to postulate that the overall mean reaction time RT is some monotonic
decreasing function of M(P). With M(P) interpreted as the expected stimulus probability,
increasing value of M(P) implies increasing overall expectancy and level of preparation on
the part of a subject, which in turn causes faster responses (reduced RT). Similarly, with M(P)
interpreted as the probability of a correct stimulus prediction, it seems intuitive that any
increase in this probability should cause RT to decrease since responses to correctly predicted
stimuli ought to be faster than those involving erroneous predictions. Finally, with M(P)
being a repetition probability, as stated above, there is considerable experimental evidence to
suggest that RT should be decreasing in M(P); see, for example, Luce (1986, Sec. 10.3) for a
review of studies showing that reactions tend to be faster when a stimulus is repeated
(sequential) than when it is not.
As to the specific form of this functional relationship, the following power model appears
to provide reasonably good fits to experimental data:

RT = α[M(P)]^(-β)   (6)

and, when p1=…=pn,

RT = αn^β   (7)

where α and β are positive parameters to be empirically determined. When n=1, RT =α so


that α is the simple reaction time. The relationship in Equation (7) has previously been
proposed by Kvålseth (1980).
To explore the goodness of fit of Equations (6)–(7) to experimental data and make
comparisons with Hick-Hyman’s law in Equations (1)–(2), a number of published data sets
were reanalyzed. The models in Equations (6)–(7) were fitted to the experimental data by
means of nonlinear regression analysis and the appropriate coefficient of determination (R2)
was properly computed (Kvålseth, 1985). The results are summarized in Table 1.
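
As an indication of how such a comparison can be set up, the sketch below fits both models to a single fabricated set of condition-level data by nonlinear least squares and compares R2 values computed as 1 minus the ratio of residual to total sum of squares; the numbers are not the published data sets of Table 1.

# Fitting Hick-Hyman's law and the power model in Equation (6) to the same
# (fabricated) condition-level data and comparing goodness of fit. These
# numbers are illustrative; they are not the data sets reanalysed in Table 1.
import numpy as np
from scipy.optimize import curve_fit

I_P = np.array([0.0, 1.0, 1.58, 2.0, 2.58, 3.0])             # bits per condition
M_P = np.array([1.0, 0.5, 0.333, 0.25, 0.167, 0.125])        # expected probability
RT  = np.array([190.0, 330.0, 410.0, 470.0, 540.0, 590.0])   # msec

def hick_hyman(I, a, b):
    return a + b * I

def power_model(M, alpha, beta):
    return alpha * M ** (-beta)

def r_squared(y, y_hat):
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - np.mean(y)) ** 2)

(a, b), _ = curve_fit(hick_hyman, I_P, RT)
(alpha, beta), _ = curve_fit(power_model, M_P, RT, p0=[200.0, 0.5])

print("Hick-Hyman  R2: %.3f" % r_squared(RT, hick_hyman(I_P, a, b)))
print("Power model R2: %.3f" % r_squared(RT, power_model(M_P, alpha, beta)))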
It is apparent from the R2 values in Table 1 that both Hick-Hyman’s law in Equations (1)–
(2) and the power model in Equations (6)–(7) provide good fits to the experimental data, with
little difference between the two models. However, while Hick-Hyman’s law is based on the
measure I(P) that lacks any operational interpretation, the power model is based on the mean
(expected) probability M(P) that has intuitively appealing interpretations in terms of the
repetition probability or the probability of correct stimulus predictions as discussed above.
According to Equation (6), a relative or fractional change in this probability (i.e., ∆M(P)/
M(P)) causes a fractional change in reaction time that is proportional to the probability
change with the negative proportionality constant -β, ranging from -.12 to -.50 for the data in
Table 1.

Table 1. Some sample comparisons between Equations (1)–(2) and (6)–(7), with the
unit of RT being msec.

Notes: (a) Average data for the four subjects were used; (b) Merkel’s data are, for instance,
given by Keele (1986); and (c) These data were for the “discrimination” situation.

References
Crossman, E.R.F.W. 1953, Entropy and choice time: The effect of frequency unbalance on choice response, Quarterly
Journal of Experimental Psychology, 5, 41–51.
Fitts, P.M., Peterson, J.R. and Wolpe, G. 1963, Cognitive aspects of information processing: II. Adjustment to
stimulus redundancy, Journal of Experimental Psychology, 65, 423–432.
Fitts, P.M. and Posner, M.I. 1967, Human Performance, (Brooks/Cole, Belmont, CA).
Fry, T.C. 1965, Probability and Its Engineering Uses, Second Edition, (Van Nostrand, Princeton, NJ).
Hick, W.E. 1952, On the rate of gain of information, Quarterly Journal of Experimental Psychology, 4, 11–26.
Hyman, R. 1953, Stimulus information as a determinant of reaction time, Journal of Experimental Psychology, 45,
188–196.
Kapur, J.N. 1994, Measures of Information and Their Applications, (Wiley, New York).
Kaufman, H., Lamb, J.C. and Walter, J.R. 1970, Prediction of choice reaction time from information of individual
stimuli, Perception & Psychophysics, 7, 263–266.
Keele, S. 1986, Motor control. In K.Boff, L.Kaufman and J.Thomas (eds.) Handbook of Perception and Human
Performance, Vol. 2: Cognitive Processes and Performance, (Wiley, New York).
Kvålseth, T.O. 1980, An alternative to Hick-Hyman’s and Sternberg’s laws, Perceptual and Motor Skills, 50,
1281–1282.
Kvålseth, T.O. 1985, Cautionary note about R2, The American Statistician, 39, 279–285.
Laming, D.R.J. 1968, Information Theory and Choice-Reaction Times, (Academic Press, London).
Luce, R.D. 1986, Response Times: Their Role in Inferring Elementary Mental Organization, (Oxford University
Press, Oxford).
Shannon, C.E. 1948, A mathematical theory of communication, Bell System Technical Journal, 27, 379–423,
623–656.
Theios, J. 1975, The components of response latency in simple human information processing tasks. In P.M.A.Rabbitt
and S.Dornic (eds.) Attention and Performance V, (Academic Press, London), 418–440.
Wickens, C.D. 1992, Engineering Psychology and Human Performance, (HarperCollins, New York).
DESIGN RELEVANCE OF USAGE CENTRED
STUDIES AT ODDS WITH THEIR SCIENTIFIC STATUS?

H.Kanis

School of Industrial Design Engineering


Delft University of Technology
Jaffalaan 9, 2628 BX Delft, the Netherlands

The application of reliability/repeatability and validity as criteria to assess


the scientific status of empirical research becomes less relevant to usage
centred studies of domestic products as these studies become more relevant
to, and applicable by, designers.

Introduction
Notions like reliability/repeatability and validity may be seen as necessary criteria in the
establishment of the scientific status of empirical research. The question dealt with in this
paper is to what extent these criteria, in the area of Ergonomics/Human Factors (E/HF), are
appropriate for usage centred research for the design of everyday products.

Criteria

Reliability/repeatability
In E/HF, the degree to which measurements or observations are free from dispersion is
addressed by the terms reliability (from the social sciences) or repeatability/ reproducibility
(from the technical sciences), see Kanis, 1997a. Aside from the unfortunate difference in
terminology, these notions as such reflect the amenability of measurement to repetition,
which constitutes a basic consideration in scientific research.

Validity
In E/HF, the extent to which observations are free from deviation or bias is addressed by the
term validity, drawn from the social sciences. The identification of deviation thrives on
limitations in dispersion: the more repeatable measurement results are, the narrower the range
within which these results cannot be demonstrated to differ systematically. The difficulty
with the term ‘validity’ is its wide interpretative span, ranging from the ‘tenability’ of a
model, via measurement results as ‘being (un)biased’ to a method ‘doing a good job’ and the
‘adequateness’ or ‘acceptability’ of a particular approach or procedure, i.e. ‘valid(ity)’ used
as common parlance, see Kanis, 1997b. In order to avoid semantic confusion in this paper,
empirical findings, or conclusions based on those findings, will be assessed for ‘deviation’,
rather than ‘(in)validity’.

Usage centred research for everyday product design


In Figure 1, a graphical representation is given of the functioning of a product operated by a
user. In Kanis (1998), it is argued that user activities (perceptions/cognitions, use actions,
including any effort involved) are the key issues in usage oriented design of everyday products.
For these user activities, human characteristics as indicated at the right in Figure 1, mainly serve
as tokens for general boundary conditions (Green et al., 1997), rather than as a base to predict
future activities and experiences of users. The attention paid to human characteristics in textbooks
on design engineering seems to be the result of the relatively good measurability of those
characteristics, rather than their design relevance. This is further illustrated by looking into the
application of measurement criteria to different types of human involvement.

Different types of human involved measurement/observation


In Kanis (1997a), the following distinction is put forward for human involved observation in
the area of E/HF:
- measuring ‘at’ human beings, such as anthropometrical characteristics with a mainly
passive/involuntary role of subjects, e.g. body mass, arm length, hand breadth;
- measuring/observing ‘through’ human beings, i.e. the recording of (results of) activities in
carrying out a task such as pronation/supination, the performance in any force exertion, and
the number of work movements;
- registration of self-reports, i.e. about perceptions, cognitive activities, and experienced
effort aired by subjects on the basis of their internal references.
Matched with Figure 1, this tripartition results in the following typification of human
involved measurement/observation within the user-block:
- ‘at’-measurands (with the term measurand adopted from ISO 1993, as the object or
phenomenon intended to be measured) at the right, i.e. as human characteristics,
- ‘through’-measurands occurring both as human characteristics (at the right), e.g. eyesight,
memory capacity, reaction time, joint flexibility, exertable forces, and as user activity (at the
left), particularly as use actions in the operation of products, and
- ‘self’-measurands at the left, including perceptions/cognitions and effort experienced in any
user activity.

Figure 1. The functioning of a product operated by a user in order to achieve some goal (from Kanis, 1998)

Occasionally, ‘through’- and ‘self’-measurands are combined, e.g. in the exertion of a


force experienced as comfortable. Assuming for the time being the adequateness of this
structuring of human involved research in a design context, the viability of research criteria
(see above) is now scrutinised for the identified types of measurands.

Specification of dispersion

‘At’-measurands
This type of measurement resembles the classical ideal from the natural sciences, involving
measurands which can be specified in a method, that is: as existing ‘out there’. Throughout
subjects, a more or less constant or so-called homoscedastic dispersion can be expected,
which means that the repeatability is a constant.

‘Through’-measurands
In this case, the involvement of human activities generally precludes numerous repetitions
per subject due to possible carry-over. Hence the application of research-designs involving
several subjects with only a limited number of repetitions per subject, in particular the test-
retest. In this type of research, a heteroscedastic dispersion tends to be found, including some
proportionality throughout subjects between the difference and the mean of test-retest results.
In Kanis (1997a), the relevance of accounting for different types of dispersion patterning, i.e.
the non-constancy of the repeatability, is discussed as regards the setting of margins in
design. For that matter, E/HF studies regularly discuss dispersion in case of ‘through’-
measurands deficiently, on the basis of the reliability coefficient r (Pearson), a fallacy
originating from the social sciences.
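To make the contrast concrete, a minimal computational sketch is given below; it is not part of the original paper. The test and retest force values are invented, and the check for heteroscedastic dispersion simply regresses absolute test-retest differences on the test-retest means, in line with the proportionality mentioned above.

```python
# Minimal sketch (not from the paper): contrasting the Pearson reliability
# coefficient with a direct description of test-retest dispersion.
# The force values (N) for ten hypothetical subjects are invented.
import numpy as np
from scipy import stats

test = np.array([120.0, 95.0, 210.0, 150.0, 80.0, 300.0, 175.0, 60.0, 240.0, 130.0])
retest = np.array([132.0, 90.0, 190.0, 162.0, 85.0, 270.0, 168.0, 66.0, 260.0, 124.0])

# Pearson r summarises linear association, not the size of the differences.
r, _ = stats.pearsonr(test, retest)

# The test-retest differences themselves describe the dispersion directly.
diff = retest - test
mean = (test + retest) / 2.0

# Heteroscedasticity check: do absolute differences grow with the mean level?
slope, intercept, r_prop, p_value, stderr = stats.linregress(mean, np.abs(diff))

print(f"Pearson r = {r:.2f}")
print(f"|difference| vs mean: slope = {slope:.3f}, p = {p_value:.2f}")
```

A high Pearson r can coexist with test-retest differences that grow with the level of the measurand, which is exactly the non-constant repeatability at issue when setting design margins.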

‘Self’-measurands.
In general, the registration of self-reports cannot be treated in terms of dispersion, since a
reasonably argued repetition for this type of recording is illusory due to the irrevocability of
perceptive or cognitive experiences. The same applies to the reporting of effort in view of the
evasiveness of any alleged constancy of internal references in human beings.

Questioning deviation
In order to avoid the muddled use of the concept of validity, and also since corroboration,
rather than a once and for all ‘validation’, is all that is achievable, questioning syntaxes have
been developed. Basically, two distinctive notions may be questioned when deviation is
observed (Kanis, 1997b):

(i) a claimed measurand, see the questioning syntax in Figure 2, and


(ii) a proposition underpinning a prediction such as an inference or generalisation.

In the first case (i), some considerations (theoretical, logical) predicting a certain relationship
between fa and fb° are taken for granted, i.e. as prop° (see Figure 2), for instance that a
maximally exerted force Fmax should exceed a comfortably exerted force, Fcomf. As
occasionally it is found that Fmax ≤ Fcomf (Kanis, 1994), in such a case at least Fmax should be
questioned as a claimed measurand.
In the second case (ii), the empirical observations to be compared are adopted as
unquestioned. This leaves only prop as amenable to reconsideration when the comparison
fa°#fb° yields a ∆, see note in Figure 2. This is at issue in measuring Fcomf and Fmax (see
example above) if fa is termed as ‘the maximum force exerted, given the instruction to do so’,
which both renders fa unquestioned by definition and raises theoretical questions as to
psycho-motoric differences between subjects after the same instruction.

Figure 2. Questioning a claimed measurand

‘At’-measurands
In case of evidence for deviating measurement results, the identification of deviation is more
or less straightforward, such as doing a measurement again with a critical eye for a suspect
part of it, or applying a different method (triangulation).

‘Through’-measurands
The example of force exertion (see above) illustrates that this type of measurand may not be
completely specified by a method: the same instruction can evoke different reactions of
people. In addition, an instruction can never be guaranteed to be unambiguous. Hence, the
identification of deviation is evasive in as far as ‘through’-measurands rely on
interpretations and reactions by people.

‘Self’-measurands
This type of measurand largely escapes from specification by any method. Then,
identification of deviation is seen as virtually impossible insofar as internally referenced
human activities cannot reasonably be linked to observables in order to produce supporting or
conflicting evidence. An example is people reporting their perceptions and cognitions. What
they say cannot, by definition, be questioned as a claimed measurand if termed as ‘people’s
utterances, being asked in a certain way to air what they see/saw, think/thought (etc.)’, i.e. as
fa° in the syntax in Figure 3. It is theoretical considerations, i.e. prop, on which the inference
is based that what people, for instance, say in ‘thinking aloud’ is what they are really thinking.
This inference is troublesome to challenge because of the difficulty of producing evidence to
compare with, i.e. fb° in Figure 3; see further Rooden, 1998.

Figure 3. Questioning a proposition

Summary and discussion


The message of the summary in Table 1 seems clear: the more design relevant usage centred
research is, the less this research can accommodate the adopted scientific criteria, i.e. the
specification of dispersion (‘reliability’/‘repeatability’) and any questioning of deviation
(‘validity’). The point is that measurands of interest in design are essentially interactive,
rather than conceivable as existing ‘out there’. Hence, it may come as no surprise that usage
centred design efforts can only benefit to a limited extent from ‘hard’ science thriving on the
positivistic ideal that any interactive ‘fuzziness’ in user-measurement confrontation may be
nullified. How can interaction be both the target in the study of product usage and,
simultaneously, be denied in the application of measurement techniques which are applied to
observe that interaction? For that matter, the interactiveness of human involved measurement
largely undermines the call for ‘valid’ methods in E/HF. Virtually, in design oriented usage
centred research, the notion of validity reduces to ‘credibility’ or ‘plausibility’, as seems
viable in qualitative research.

Table 1. Design relevance vs. scientific criteria for different types of measurands

References
Green, W.S., Kanis, H. and Vermeeren, A.P.O.S. 1997, Tuning the design of everyday
products to cognitive and physical activities of users. In S.A.Robertson (ed.)
Contemporary Ergonomics, (Taylor and Francis, London), 175–180
ISO, 1993, Guide to the expression of uncertainty in measurement, (International Organization
for Standardization, Geneva)
Kanis, H. 1994, On validation. In Proceedings of the Human Factors Society 38th Annual Meeting,
(Human Factors and Ergonomics Society, Santa Monica, CA, USA), 515–519
Kanis, H. 1997a, Variation in measurement repetition of human characteristics and activities,
Applied Ergonomics, 28, 155–163
Kanis, H. 1997b, Validity as panacea? In Proceedings 13th IEA Congress, (Finnish Institute of
Occupational Health, Helsinki), 235–237
Kanis, H. 1998, Usage centred research for everyday product design, Applied Ergonomics, 2,
75–82
Rooden, M.J. 1998, Thinking about thinking aloud. In M.Hanson (ed.) Contemporary
Ergonomics, (Taylor and Francis, London), this issue
The Integration of Human Factors considerations into
Safety and Risk Assessment systems.

J.Lola Williamson-Taylor, Ph.D, MIOSH

AWE Plc, Aldermaston, Reading, UK. RG7 4PR

This paper describes a methodology for Human Factors Integration (HFI)


into safety and risk assessment for safety case purposes. It explains the
identification, assessment and screening of human factors contribution to
major hazard scenarios and the treatment of human deficiency/recovery
tendencies in safety management systems, safety culture and organisational
factors as an integral part of the risk assessment process. The paper is
freely formatted and aimed at risk assessment specialists with safety case
expertise.

1.0 Introduction
The safety and risk assessment of hazardous installations such as chemical, explosives and
nuclear processing facilities, their supporting functions and management control systems is
required to incorporate human factors considerations. The complexity of human involvement
in such facilities means that the associated human factors are not constrained to the human
and machine interface alone, but also include those inherent in the safety management
systems, safety culture and organisational frameworks (Anderson et al, 1990; Joksimovich et
al, 1993; Pate-Cornell, 1990).
In the UK, the safety case approach has been fairly well established as an effective and
comprehensive means of justifying the safety of hazardous installations throughout their life
cycle. The safety case requirement is set to be extended to other industry sectors, as seen from
the recent extension to the offshore, railway and mining industries. The incoming Control of
Major Accident Hazards Regulations (COMAH), currently known as the SEVESO II directive,
will further underpin the current regulatory trend.
The risk assessment of major accident hazards and significant safety concerns is a crucial
part of a safety case for regulatory purposes such as licensing and permissioning. The causes

of technical and hardware failures and operator errors have been linked to deep-seated human
factors in the management decision processes and organisational frameworks. However, these
are often omitted or poorly treated in safety cases, mainly due to lack of a practical
integration methodology. Also, the scope of these human factors is not seen to be open to a
formal definition and an objective assessment by many specialist risk assessors.
This paper describes a “true” human factors integration into the safety/risk assessment
process for safety cases. The integration is based on clear principles and achieved by
identifying the critical path for the appropriate treatment of the various types of human
factors considerations throughout the risk assessment process.

2.0 The Basic Principles


If we are to consider both the direct and indirect human factors with significant
impact on safety, we must understand the level at which they can be adequately treated within
an assessment process. An appropriate methodology must be applied to achieve a credible
and reliable result. This is the basic philosophy behind the methodology.
For safety cases, a clear strategy of approach that takes account of the safety assessment
system in use and the extent to which advanced risk assessment techniques are applied needs
to be established. Competency of the specialist assessors and reviewers, interdisciplinary
team practices and use of specialist contractors should be included when formulating the
strategy.
The identification of a critical path for the integration of a framework for the identification
and assessment of direct and indirect human factors embedded in the safety management
system and organisational domains should be unambiguous and transparent. The HFI process
must be compatible and harmonised with each level within the risk assessment process. To
produce a reliable deterministically-based assessment, an appropriate balance must be struck
between the qualitative and quantitative assessment and targets.
The probabilistic treatment of human error within the Probabilistic Risk Assessment
(PRA) must acknowledge the sparsity of good quality data and must validate the data, sources
and assumptions made.

3.0 Safety Management System and Organisational Frameworks


An important aspect of a safety case is the demonstration of the effectiveness of the Safety
Management System (SMS) and the organisation for safety. This aspect is often described
but not backed up by an assessment of their reliability. The key controlling elements within
the safety management system and organisation frameworks need to be included in the risk
assessment. Here, it is essential to make a distinction and appreciate the difference between
human error causing an immediate failure in the system and that which makes hazardous
events more probable.

For the purpose of a safety case, a powerful two stage technical audit procedure, such as
Critical Examination (CE) for safety assessment studies (Williamson-Taylor, 1995) can be
applied at the early stage of the risk assessment. The first stage is a broad scope technical
audit followed by the application of a set of specialist ergonomic audit tools to assess specific
aspects such as procedures and communications, training and supervision, human
and machine interface, safety culture, the environment, organisation for safety—
including decision processes for manning, safety related posts, technical support, use of
contractors etc.

4.0 The Human Factors Integration into Safety Assessment Methodology


The description of a risk assessment process for a hazardous facility integrating human
factors considerations is given below.

Preliminary Safety Review and Human Factors Considerations


The key to a competent treatment of both the direct and indirect human factors within a risk
assessment framework is one of identification. A methodology capable of high level
identification of the hazardous systems and operations, high impact human involvement in
process/systems operations and key SMS subsystems should be applied.
The screening of the output from the application of a methodology such as CE includes
the identified tasks associated with high hazards, high impact and vulnerable task activities,
and SMS sub-systems whose functionality depends on a high degree of human success. Management
issues relating to general occupational safety and health rather than major accident hazards
can be appropriately filtered into the relevant management system for their adequate
treatment.

Formal Hazard Identification and Human Factors Identification


The human tasks, hazardous systems and operations and key SMS subsystems identified in
the previous stage are further assessed using dedicated techniques—HAZard and OPerability
study (HAZOP) and/or Failure Mode and Effect Analysis (FMEA) for process and hardware
systems; an appropriate combination of Task Analysis (TA) tools for specific human tasks;
and Human HAZOP for safety management sub-systems supported by appropriate TA
supplementary tools. Organisational factors and other factors having influencing effects on
human performance are assessed using dedicated ergonomic tools. The various assessments
are best performed in parallel. This will require good project management, planning and use
of competent multi-skilled assessors.
The output from these assessments is a schedule of faults and human error sequences and
conditions leading to events which can be either initiating or top events. The qualitative
screening of incredible faults and human error can be carried out using engineering and
expert knowledge, aided by short-cut or coarse risk criteria.
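Purely as an illustration of the mechanics of such coarse screening, one possible form of the criteria is a likelihood-by-consequence matrix; the category scales, scores and threshold in the sketch below are hypothetical and are not taken from the methodology described in this paper.

```python
# Illustrative only: a likelihood x consequence matrix as one possible form of
# 'coarse risk criteria'. The category scales, scores and screening threshold
# are hypothetical and not taken from the methodology described in this paper.
LIKELIHOOD = {"incredible": 0, "remote": 1, "unlikely": 2, "possible": 3, "frequent": 4}
CONSEQUENCE = {"negligible": 0, "minor": 1, "serious": 2, "major": 3, "catastrophic": 4}

def screen(sequences, threshold=4):
    """Retain only fault/error sequences whose coarse risk score reaches the threshold."""
    retained = []
    for seq in sequences:
        score = LIKELIHOOD[seq["likelihood"]] * CONSEQUENCE[seq["consequence"]]
        if score >= threshold:
            retained.append({**seq, "score": score})
    return retained

# Hypothetical output from hazard identification.
faults = [
    {"id": "operator omits isolation check", "likelihood": "possible", "consequence": "major"},
    {"id": "simultaneous triple instrument failure", "likelihood": "incredible", "consequence": "catastrophic"},
]
for fault in screen(faults):
    print(fault["id"], "retained for quantification, score", fault["score"])
```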

Risk Analysis Integrating Human Error Analysis


This stage involves the estimation of the potential consequence of a given event and the
probability of the event occurring to estimate the risk. Fault Tree and Event Tree analysis are
the most commonly used methods to provide the architecture of the relationship between the
top events and initiating events and conditions. Only a proportion of hazardous events will
require detailed quantification of the fault sequences and risk.
The architecture of the relationship between the immediate precursor to a top event and the
immediate precursor to a sub-event that can lead to a top event provides a good qualitative
basis for examining the possible combinations of initiating events for the minimum cut set
analysis. The quantification process should include human reliability quantification using
methods such as HEART (Kirwan, 1994) as an integral part of the process. Comparison with
a target for overall risk acceptability, using the ALARP principle and tolerability of risk criteria,
will then incorporate well structured human factors considerations.
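By way of illustration only, the sketch below shows how minimal cut sets and a HEART-style human error probability can be combined under the rare-event approximation. The basic events, cut sets, nominal error probability and error producing conditions are all invented, and the adjustment follows the general HEART form described by Kirwan (1994) rather than any figure from this methodology.

```python
# Illustrative sketch only: combining a HEART-style human error probability with
# hardware failure probabilities through minimal cut sets. The basic events, cut
# sets, nominal error probability and error producing conditions (EPCs) are all
# invented; the adjustment follows the general HEART form:
# adjusted HEP = nominal HEP x product of [(EPC effect - 1) x proportion of affect + 1].
def heart_hep(nominal_hep, epcs):
    hep = nominal_hep
    for max_effect, proportion in epcs:
        hep *= (max_effect - 1.0) * proportion + 1.0
    return min(hep, 1.0)

# Hypothetical basic event probabilities (per demand).
p = {
    "operator_fails_to_isolate": heart_hep(0.003, [(11, 0.4), (3, 0.2)]),
    "relief_valve_fails": 1e-3,
    "trip_system_fails": 5e-4,
}

# Hypothetical minimal cut sets for a top event such as loss of containment.
cut_sets = [
    ["operator_fails_to_isolate", "trip_system_fails"],
    ["relief_valve_fails", "trip_system_fails"],
]

# Rare-event approximation: the sum of cut set probabilities bounds the top event probability.
top_event = 0.0
for cut_set in cut_sets:
    prob = 1.0
    for event in cut_set:
        prob *= p[event]
    top_event += prob

print(f"Top event probability (rare-event approximation): {top_event:.2e}")
```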
The deductions from the risk assessment are a set of technical guidelines, engineering and
management systems such as safe operating envelopes for processes and systems, key
operating procedures and minimum standards, mechanical integrity and maintenance
standards for safety critical and related systems, organisation for safety, emergency plans/
response and crisis management etc. These are SMS sub-systems run by people through the
organisational decision systems required to comply with the safety standard and risk level
defined by the safety case. The full scope of human factors for safety and continuous
improvement becomes inherent and demonstrable in the deduced systems.

The Demonstration of Safety of a Hazardous Facility


The demonstration of safety and risk level will draw on the evidence from both the
qualitative and quantitative assessment and the reliability of the deduced supporting SMS.

Validation
The validation of the HFI methodology for safety assessment and safety cases involves the
evaluation of the quality and reliability of the methodology and verification of the correctness
of the tools within it. The detailed methodology upon which this paper is based (Williamson-
Taylor, 1996) was independently validated on this basis. Pilot and real applications have
demonstrated the methodology’s ability to achieve its objective.

5.0 Conclusion
Human factors considerations should address the full scope of all significant human
influences within a safety case of a hazardous installation. A structured methodology such as
the one upon which this paper is based is required. This methodology demonstrates that
human factors in safety management systems and organisational domains can be integrated
into a safety case.

References
Anderson, N., Schurman, D. and Wreathall, J. 1990, A structure of influences of management
and organisational factors on unsafe acts at the job promoter level, Proceedings of the
Human Factors Society 34th Annual Meeting, Orlando, FL, 8–12 October 1990, 881–
884 (The Human Factors Society, Santa Monica, CA, USA).
Joksimovich, V., Orvis, D. and Moien, P. 1993, Safety Culture via Integrated Risk Management
Programme. Proceedings of the Probabilistic Safety Assessment International Topical
Meeting, Clearwater Beach, FL, January 26–29, 220–226 (American Nuclear Society,
La Grange Park, IL, USA).
Kirwan, B. 1994, A Guide to Practical Human Reliability Assessment (Taylor and Francis,
London).
Pate-Cornell, M. 1990, Organisational aspects of engineering systems safety: The case of
offshore platforms, Science, 250, 1210–1217
Williamson-Taylor, J. 1995, Critical Examination Methodology for safety case preparation
(AWE plc—internal publication)
Williamson-Taylor, J., 1996, Human Factors Integration into safety assessment processes
and safety cases. (AWE plc—internal publication)
THE USE OF DEFIBRILLATOR DEVICES
BY THE LAY PUBLIC

Tracy Gorbell and Rachel Benedyk

Ergonomics and HCI Unit


University College London
26 Bedford Way
London WC1 0AP

A defibrillator is a device used to apply an electric shock to a patient’s chest


to stop the haphazard activity of the heart that results in cardiac arrest. It is
proposed that the lay public could apply defibrillation in advance of the
paramedic’s arrival. To investigate several ergonomic aspects of novice use
of the device, sixty-one subjects were recruited into one of two experimental
designs, with or without training. Four performance measures were assessed
using real defibrillators on a simulated task. The results of the study show
that performance was better for those subjects with training. Times to
defibrillation depended on the usability of particular features. Resuscitation
experience did not appear to influence the results. Lay people could
successfully defibrillate with optimal designs of device.

Introduction
Successful resuscitation from cardiac arrest due to heart attack depends on what is commonly
known as the ‘chain of survival’ (Bossaert and Koster, 1992; Resuscitation Council UK,
1994). The chain is characterised by four links: 1) early access, 2) early cardiopulmonary
resuscitation (CPR), 3) early defibrillation and 4) early advanced care. The degree of success
achieved in resuscitating an individual depends on the rapid application of this chain in an
emergency situation.
Typically the first two links have been taught to the lay public through First Aid and CPR
courses. As part of a wider initiative to improve the chances of survival from sudden cardiac
death, several authors have supported the idea of extending the role of the lay public to
include stage three, defibrillation (Weisfeldt et al, 1996; Bossaert and Koster, 1992).
Implementation of automated external defibrillators (AEDs) into the community has been
suggested, amongst others, for densely populated areas such as airports. The analogy put
forward is that AEDs could become like ‘fire extinguishers’. However, is this what is really

meant by such an analogy, that anyone could pick up an AED and use it in a life-threatening
situation without training?
Although technology used to analyse cardiac rhythms is not new, actual advances in AED
design have increased the possibility of their use by lay individuals. AEDs aimed at public
access are portable, maintenance-free, provide audible and/or visual prompts and require no
recognition of complex heart rhythms by the user.
Whilst differences in design will determine the exact sequence of use, operating an AED
involves several key actions. Once the AED is turned on, verbal and/or visual instructions
prompt the operator to connect the electrodes. The electrodes are positioned as indicated on the
electrode packaging (or AED itself) to the patient’s exposed chest. Analysis of the heart
rhythm is then initiated either automatically or through pressing an ‘analyse’ button. The
AED then decides whether it is necessary to administer a shock. If advised, the operator
delivers the shock by pressing the appropriate button. Prior to each shock, the operator is
required to make both a visual check of the area and give a command to “stand clear”. The
safety actions are important to ensure that no one is inadvertently shocked. If a shock is not
advised, the operator is prompted to commence CPR.
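The generic sequence described above can be summarised as a simple prompt loop; the sketch below is illustrative only, does not reproduce any manufacturer's logic, and uses invented prompt wording and an assumed maximum of three shocks per episode.

```python
# Illustrative only: the generic operating sequence described above expressed as
# a simple prompt loop. This does not reproduce any manufacturer's logic; the
# prompt wording, shock decision and maximum number of shocks are placeholders.
def aed_sequence(shock_advised, max_shocks=3):
    prompts = ["Attach electrodes to the patient's bare chest as shown on the packaging."]
    shocks_delivered = 0
    while shocks_delivered < max_shocks:
        prompts.append("Analysing heart rhythm - do not touch the patient.")
        if shock_advised(shocks_delivered):
            prompts.append("Shock advised. Check the area and say 'stand clear'.")
            prompts.append("Press the shock button now.")
            shocks_delivered += 1
        else:
            prompts.append("No shock advised. Begin CPR.")
            break
    return prompts

# Example: a rhythm that remains shockable for the first two analyses only.
for step in aed_sequence(lambda shocks_so_far: shocks_so_far < 2):
    print(step)
```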
Given that a role for others outside of the medical profession exists, who are first
responders and what does the term lay person mean? First responders can be characterised as
those individuals who as part of their profession are likely to be first on scene in an
emergency e.g. the police. Lay person responders on the other hand could be distinguished as
those individuals who are responsible for others within their work domain e.g. airline cabin
crew. The common factor amongst these groups is that they have some degree of training.
However, if the analogy of a fire-extinguisher is applied to AED use, then a further category
of users arises, the untrained, non-medical public.
One of the main advantages of improved defibrillator technology is that less
knowledgeable individuals can be taught how to use an AED. Weisfeldt et al (1996) suggest
that the training requirements of the public “depend on whether the objective is familiarity
with the concept or ease in use of the device”.
The introduction of unfamiliar devices and techniques previously considered outside of
the lay public domain requires both careful thought and implementation. Ergonomic design
challenges include how to package an AED so that it communicates both its purpose and
operation to even a lay user. At the same time the design should be such that it can be used
effectively and safely. Furthermore, it is not only necessary to look at the individual design
features of a system or device, but to consider the device in the full context in which it will be
used. For this, a way of identifying possible intervention strategies that guard against error
and optimise design is needed (Benedyk and Minister, 1997).
The purpose of this study was to investigate several ergonomic aspects of AED use by the
lay public. Several key areas were identified. These included whether resuscitation experience,
training versus non-training and certain features of AED design would affect the user’s use of
AED units. Finally, does training in one AED transfer to a second, different AED?

Method
Sixty-one subjects were recruited into one of two experimental designs, with training or
without training. Two AED designs were used and four performance measures were assessed
using a real AED on a simulated task. All subjects were asked to complete a questionnaire
regarding the usability of the AED units.

Experiment 1—With Training


Thirty-two subjects participated in experiment 1 (16 CPR-trained individuals and 16
novices). Subjects were divided into four groups so that both a CPR-trained and novice group
received training with either AED-A or AED-B. Each group received a two hour training
session which included a lecture, practice time and assessment of subjects’ AED
performance. A modified section of the advisory external defibrillator protocol (as
recommended by the London Ambulance Service Steering Committee) was used in the
experiment (Figure 1). Four performance measures (Figure 2) were assessed using a
performance checklist. A week later, each group was re-assessed using the same scenario. On
this occasion, half of the subjects within each group were re-assessed using the same AED.
The other half were re-assessed using a different AED unit. This was considered to be of
interest because of future implications for training programmes and the possibility for
standardisation of public access defibrillator units.

Figure 1. Modified section of defibrillation protocol followed in scenario

Experiment 2—Without Training


Twenty-nine subjects (10 paramedics, 9 CPR-trained individuals and 10 novices) participated
in experiment 2. Each subject was asked to try and deliver three shocks using both AED units
without prior instruction. Half of the subjects in each population used AED-A followed by
AED-B. The others did the reverse. The scenario and assessment of performance were the
same as those described in experiment 1.
Unless otherwise stated, the Wilcoxon Rank Sum test was used to analyse the data from
both experimental designs.
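As an illustration of the kind of comparison reported below, a Wilcoxon rank-sum test on two groups of times to defibrillation might look as follows; the times are invented, and the scipy call merely stands in for whatever statistical package was actually used.

```python
# Illustrative only: the kind of two-group comparison reported in the Results,
# using the Wilcoxon rank-sum test on invented times to first shock (seconds).
from scipy.stats import ranksums

aed_a_times = [92, 105, 88, 120, 99, 110, 131, 97]  # hypothetical values
aed_b_times = [70, 81, 76, 90, 68, 84, 79, 73]      # hypothetical values

statistic, p_value = ranksums(aed_a_times, aed_b_times)
print(f"Wilcoxon rank-sum statistic = {statistic:.2f}, p = {p_value:.3f}")
# p < 0.05 corresponds to 'significant at the 5% level' as reported in the Results.
```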

Figure 2. Description of performance measures recorded

Results and Discussion

Experiment 1
With training, no significant differences (at the 5% level) in performance scores were
observed between novice and CPR-trained subjects. To incorporate novice subjects, the
scenario used (Figure 1) did not require subjects to perform CPR. Thus with training both
groups could perform the task set in the experiment. Resuscitation experience did not appear
to have an additional advantage.
Whilst AED action and safety scores did not differ between the two units, times to
defibrillation were significantly quicker (at the 5% level) for subjects trained to use AED-B.
With AED-A, the electrodes were not pre-connected to the main unit, so additional steps
had to be completed by the user to connect the electrode lead. Furthermore, after the first
analysis cycle and subsequent shock, the sequence returns to the point of pushing the analyse
button. In contrast, AED-B automatically re-analyses the patient’s heart rhythm. Times to
defibrillation are therefore intrinsic to the AED unit.
When subjects were re-tested using a second unfamiliar defibrillator, differences in
performance were observed. Using a paired sample t-test, times to defibrillation were significantly
slower (at the 5% level) for subjects trained on AED-B and re-tested on AED-A. In contrast, for
those trained on AED-A and re-tested on AED-B, times to defibrillation were not significantly
better or worse (at the 5% level). AED action points were lost for both novice and CPR-trained
groups and for both units. Interestingly, it was always action 4 (correct positioning of electrodes)
where points were lost. Whilst the positioning of the electrodes is the same, the shape, size and
labelling of the electrodes differ between the two units.
In general, it appeared that performance deteriorated if subjects were re-tested using the
more complex of the two AED units. Standardisation of future AED designs would be
advisable. However, because several companies manufacture AEDs, a more practical
approach may be to standardise certain features. For example, all AEDs could have the same
pre-connected electrodes.

Experiment 2
Without training, times to defibrillation did not significantly differ (at the 5% level) between
the three subject groups using either AED unit. In contrast, safety scores were significantly
better (at the 5% level) for the expert group when compared with the novice and CPR-trained
subjects. Points were consistently lost for safety actions 2 and 3 (Figure 2). Simply, the
untrained lay subjects were not aware of the required safety actions. Although no significant
differences (at the 5% level) in AED action scores were found between the subjects, all
groups consistently lost points associated with AED action 4. As with the trained subjects,
resuscitation experience in itself did not have an effect; rather, it was prior experience of
defibrillation that affected the results.
Comparisons between the three subject groups using both AED units demonstrated that
times to defibrillation were again, significantly quicker (at the 5% level) using AED-B.
If, as suggested, the analogy of the fire extinguisher is applied to AEDs, taken literally,
this would imply that AED use would not require training. The results of this study however,
demonstrated that significant differences (at the 5% level) in performance did exist between
trained novice and CPR-trained subjects when compared with their non-trained counterparts.
The untrained groups took the longest times to defibrillation and lost the most safety points,
thus highlighting the need for some instruction. As public access defibrillation will be an
unfamiliar concept to many, in the absence of experience, the provision of a good conceptual
model can be addressed through training.

Conclusion
In conclusion, all subjects were able to use an AED. Differences in performance suggest that
lay individuals will need training. Although AED design in general supported ease of use, to
improve upon existing designs, issues relating to usability need to be driven from the end-
users’ point of view. Ergonomic research can therefore support that conducted by the medical
and engineering professions.
As these results were obtained from small samples of data, further investigation in this
area is suggested.

References
Benedyk R and Minister S, 1997. Evaluation of Product Safety Using the BeSafe Method.
In Stanton N (ed), Human Factors in Consumer Products. Taylor and Francis, 1997,
in press
Bossaert L and Koster R, 1992. Defibrillation: Methods and Strategies. Resuscitation (24)
211–225
Resuscitation Council UK, 1994. Advanced Life Support Manual, 2nd Ed. Burr Associates
Weisfeldt ML, Kerber RE, McGoldrick P, Moss AJ, Nicol G, Ornato JP, Palmer DG, Riegel
B and Smith SC, 1996. American Heart Association Report on the Public Access
Defibrillation Conference December 8–10, 1994. Resuscitation (32) 127–138
OCCUPATIONAL DISORDERS IN GHANAIAN
SUBSISTENCE FARMERS

Marc McNeill1 and Dave O’Neill2

1Department for International Development, 94 Victoria Street, London SW1E 5JL
2Silsoe Research Institute, Silsoe, Bedford MK45 4HS

A survey of 100 (male) subsistence farmers in the Brong Ahafo region of


Ghana was undertaken to identify the predominant causes of ill-health in this
sector of the population. Injuries from cutlass accidents and back pain were
found to be prevalent (79% and 76% respectively), with back pain being the
more debilitating, accounting for, on average, 19 days lost from work. A
greater number of working days were lost from gunshot wounds (60), broken
bones (38) and snakebites (29), but these were less prevalent. The use of
handtools was heavily implicated in many of the activities associated with
the onset of ill-health. It is concluded that improved designs of handtools
could increase the farmers’ productivity and quality of life.

Introduction
In Ghana agriculture accounts for 47.8% of GDP, employs about 60% of the total labour
force and contributes 70% of total export earnings (GSS, 1994). The majority of this is small
scale subsistence farming where manual labour contributes an estimated 90% of the energy
used for crop production (FAO, 1987). The full potential of this energy is often not realised,
with the workers’ physical capacity being reduced because of ill health from occupational
disorders: diseases or injuries attributable to work practices, work demands or the work
environment (Rainbird and O’Neill, 1993).

A perception that occupational health is solely an industrial concern and that health and
safety issues are less of a problem to the agricultural sector than the industrial sector seems to
persist (Mohan 1987). Whilst some research has been conducted into occupational disorders
in industrially developing countries, very little has focused upon agriculture. Rainbird and
O’Neill (1993) in their review of occupational disorders affecting agricultural workers in
tropical developing countries grouped agricultural occupational disorders into three broad
categories: health problems associated with pesticides, musculoskeletal disorders, and
occupational diseases such as zoonoses and farmer’s lung. They specifically excluded

occupational accidents that may be a significant cause of lost productivity in agriculture.


Nogueira (1987) described a survey of agricultural accidents carried out in Brazil where
9.22% of workers suffered accidents at work, of which 45.98% were caused by handtools. In
Ghana where most farming activities are carried out using hand tools, the incidence of
injuries from handtools may be expected to be greater.

A participatory rural appraisal (PRA), along the lines described by O’Neill (1997), conducted
with farmers in the Brong Ahafo (BA) region of Ghana, suggested that accidents, injuries and
illnesses as a result of agricultural activities are not uncommon. In particular, musculoskeletal
disorders were identified as a problem with a majority of farmers complaining of lower back
pain. Injuries from hand tools were common, farmers claiming that lacerations from slashing
the bush with cutlasses or weeding with hoes were a regular hazard. Other occupational disorders
that farmers claimed to suffer from included thorn pricks from weeds such as Acheampong
(Chromolaena odorata) and Speargrass (Imperata cylindrica), gunshot wounds and fever from
working in the sun. Occupational disorders from post-harvest agro-processing activities, which
are mostly carried out by women, were also found to be common. These usually involve much
drudgery, with repetitive upper body motions (e.g., stirring, kneading, pounding) in unpleasant
environments (e.g., smoke, dust). For a more detailed account of occupational health in agro-
processing, refer to Fajemilehin and Jinadu (1995).

Discussions with medical personnel in clinics and hospitals in the Wenchi district of BA and
with traditional herbalists supported the hypothesis that occupational disorders are a problem
for farmers. Whilst malaria is by far the most common cause of morbidity and admission to
hospital in Wenchi district it is by no means the only cause. In 1996, accidents (trauma and
burns) were the sixth most common cause of morbidity (Antwi, 1997). Whilst there is no
indication as to the causes of these, the medical personnel and herbalists suggested that agricultural
accidents may be the most frequent cause. From the records at one hospital, morbidity amongst
farmers that may be related to occupation (such as trauma, lower back pain and snake bites)
accounted for approximately 11% of all cases seen. Given the apparently often hazardous nature
of many of the activities with which they are involved, this study aimed to establish how Ghanaian
subsistence farmers are affected by occupational disorders.

Methodology
From earlier PRA work with farmers and discussions with health personnel, a questionnaire
was constructed covering the major occupational disorders that had been identified. It was
piloted before being incorporated into a larger survey of farming practices in the Wenchi
district. Whilst women are also farmers (and indeed their burden of agricultural work may be
greater, undertaking activities such as agro-processing, water and firewood collection along
with tending to the farm), the logistics of this limited survey prevented them from being
included. Hence the questionnaire was administered to farmers (predominantly male heads of
household) in four villages in the Wenchi district. A total of 100 farmers from 168 households
were interviewed.

Results and Discussion


Table 1 provides a summary of the days lost and the costs of disorders for various activities,
from information collected over two cropping seasons (i.e. one year).

Table 1. Mean costs and days lost from occupational disorders

Musculoskeletal disorders
Back pain was suffered by 76% of the farmers. This may be related to extended periods of
hard work in awkward postures that are observed during many agricultural activities. Indeed
all the activities that were attributed to causing back pain (Figure 1) are traditionally
undertaken using short handled hoes and cutlasses that necessitate a stooping posture. Several
farmers claimed they were unable to work for long periods with chronic back pain. The mean
number of days lost from back pain was 19 days.

Complaints of chest pain were made by 42% of the farmers in the last two cropping seasons.
The most commonly cited cause of this was making yam mounds; this activity involves
the farmer bending over, using a short handled hoe to move soil between his legs creating a
mound approximately 0.5m high.

An opportunity sample of 40 farmers from the original 100 were also asked whether they
were suffering from lower back pain now and whether they had suffered lower back pain in
the last year. The point prevalence was 48%, whilst 77% of farmers claimed they had suffered
from lower back pain in the last year.
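The two figures above differ simply as point versus one-year (period) prevalence; the sketch below uses hypothetical counts chosen only to approximate the reported percentages.

```python
# Illustrative arithmetic only: point versus one-year (period) prevalence.
# The counts are hypothetical, chosen to approximate the reported percentages
# for the 40-farmer opportunity sample.
sample_size = 40
reporting_lbp_now = 19        # lower back pain at the time of the interview
reporting_lbp_last_year = 31  # lower back pain at any time in the previous year

point_prevalence = reporting_lbp_now / sample_size         # about 48%
period_prevalence = reporting_lbp_last_year / sample_size  # about 78%

print(f"Point prevalence:  {point_prevalence:.0%}")
print(f"Period prevalence: {period_prevalence:.0%}")
```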

Figure 1. Activities attributed to causing back pain

Hand tools
The cutlass is a multi-purpose tool, being used in clearing the bush (slashing and cutting),
planting (digging holes with the blade end), weeding (turning over the soil with the blade
end) and harvesting (cutting and digging). Over the two cropping seasons the most common
occupational disorder affecting farmers was cutlass injury. The weight of the cutlass and its
handle design may be important factors in the incidence of accidents using cutlasses. Hoes
are predominantly used during land preparation (i.e., making mounds) and during weeding.
As well as being associated with musculoskeletal disorders, 42% of farmers claimed they had
sustained an injury from hoeing.

Burns and fever


Farmers burn their land during the dry season to clear the soil for planting and to rid the land
of weed seeds and pests. Fires are also lit by hunters to drive out animals. With intensified
cultivation, longer dry seasons and increasing spread of grasses the fires can easily get out of
hand. Thus, burns are common, with 50% of farmers claiming to have been injured,
predominantly during the dry season. Burns may not be the only health hazard from bush
fires: almost 74% of fevers during the dry season were attributed to burning. Whilst many of
these fevers may be malarial (farmers do not discern between malaria and any other fever) it
is suggested that they may be the symptoms of upper respiratory problems from smoke and
dust inhalation, or heat stress and heat related illnesses.

Pesticide problems
The results from this survey indicate that 28% of farmers had suffered from sickness
following chemical use. This is an indication of acute pesticide poisoning rather than the
effects of long-term exposure to pesticides, which would require objective, or clinical, analysis
(such as inhibition of cholinesterase activity) to reveal (Rainbird and O’Neill, 1993). Several
reasons for the incidence of pesticide poisonings are suggested in Table 2.

Table 2. Suggested reasons for incidence of pesticide poisonings

Snake bites and injuries from plants


Acheampong is a common weed that is claimed to have medicinal properties and is used in
the preservation of corpses. When it is dried, however, the sharp ends are thought to be
poisonous and present a significant hazard of injury and infection: injuries from Acheampong
were reported by 69% of the farmers. Speargrass is also a hazard with the risk of lacerations
and puncture wounds. Discussions with farmers suggested that these injuries occur mainly
around the feet and ankles. Snake bites, which are universally feared, also occur around the
lower legs. Many farmers wear Wellington boots to protect themselves from these hazards;
however, they are expensive and inappropriate for the tropical environment. There is,
therefore, an apparent need for comfortable, low cost leg protection.

Conclusions
The results of this survey have indicated that occupational disorders are a major problem in
Ghanaian subsistence agriculture. Injuries from hand tools, musculoskeletal complaints (back
pain) and fever that is attributed to work are the most common. The immediate cost to
the farmers, both in terms of lost work and the financial burden of treatment, be it traditional
or allopathic, can be considerable. When farmers only have a limited window, dictated by
climatic changes, in which to undertake certain activities, an injury or illness that is sustained
at these times can have serious consequences for the success of the crop.

Many of the occupational disorders identified in this study could benefit from improvements
following a participatory ergonomics approach. For example, whilst the cutlass and hoe are
the traditional tools used by subsistence farmers it is apparent that they cause many injuries,
and the posture required to use them may be a contributing factor in the high incidence of
back pain. Nwuba and Kaul (1986) investigated the biomechanical and physiological aspects
of using short and long handled hoes. They found the short-handled hoe exerted considerable
spinal muscle force and associated this with the “sharp pains low in the back when hoeing”. It
also had a 64% greater demand in terms of work rate and 51% greater energy expenditure per
unit volume of soil moved when compared with the long-handled hoe. Yet, whilst such
improvements as long-handled hoes may appear to be beneficial to farmers, there may be
cultural or traditional reasons why an ergonomics intervention may be resisted. Freivalds
(1987) suggested that a lack of impetus for changing tool design is a resigned view arising

from the belief that no further improvement is possible to a tool which has been used by
many people for many years. Johnson and O’Neill (1979) noted that many attempts have been
made to introduce improved tools, such as scythes, into Africa, but these have mostly failed. They
suggested that the main reason, in broad terms, was that a participatory approach had not
been taken. By introducing a participatory, multi-disciplinary ergonomics approach to the
causal factors of the occupational disorders identified in this paper, it is considered that
accidents, injuries and ill-health can be reduced. This will result in raised work capacity,
improved health and higher productivity (Elgstrand, 1985).

Acknowledgement
This paper is an output from a project funded by the UK Department for International
Development (DFID) for the benefit of developing countries. The views expressed are not
necessarily those of the DFID.

References
Antwi, Y, 1997, Personal communication. (District director for health services, District
Health Services, Ministry of Health, Wenchi District).
Elgstrand, K., 1985, Occupational safety and health in developing countries. American
Journal of Industrial Medicine, 8, 91–93.
Fajemilehin, B.R. and Jinadu, M.K. 1995. African Newsletter on Occupational Health and
Safety, 5, 38–39.
FAO, 1987, African Agriculture: The next 25 years. Food and Agriculture Organisation,
Rome.
Freivalds, A., 1987, The ergonomics of tools, In D.J.Oborne (ed), International Reviews of
Ergonomics, 1, 43–75 (Taylor and Francis, London).
GSS, 1984, Quarterly digest of statistics, Ghana Statistical Service.
Johnson, I.M. and O’Neill, D.H., 1979, The role of ergonomics in tropical agriculture in
developing countries. In Ergonomics in Tropical Agriculture and Forestry: Proceedings of
the 5th Joint Ergonomics Symposium, Wageningen, 125–129.
Mohan, D., 1987, Injuries and the poor worker. Ergonomics, 30(2), 373–377.
Nogueira, O.P., 1987, Prevention of accidents and injuries in Brazil, Ergonomics 30(2),
387–393.
Nwuba, E.I.U. and Kaul, R.N., 1986, The effects of working posture on the Nigerian hoe
farmer, J. agric. Engng Res., 33, 179–185.
O’Neill, D.H., 1997 Participatory ergonomics with subsistence farmers. In S.A. Robertson
(ed), Contemporary Ergonomics 1997, Proceedings of the Ergonomics Society 1997
Annual Conference, Taylor and Francis Ltd, London, 232–237.
Rainbird, G. and O’Neill, D, 1993, Work-related diseases in tropical agriculture: A review of
occupational disorders affecting agricultural workers in tropical countries. Silsoe
Research Institute.
AUTHOR INDEX

Alexander, P. 87 David, H. 429


Andersen, D.M. 2, 41 Davies, I.R.L. 295
Apperley, M. 274 Dempsey, P.G. 503
Atkinson, G. 208 Desmond, P.A. 451
Atkinson, T. 404 Devereux, J. 25
Dickens, A. 198, 213
Baber, C. 198, 213, 338 Dickinson, C. 36, 46
Baker, N. 515 Dillon, J. 546
Banbury, S. 482 Donohoe, L. 404
Banks, G.M. 419 Donovan, K.J. 535
Barbour, R. 274 Donnelly, D. 424
Barzegar, R.S. 311 Duggan, C. 376
Benedyk, R. 587 Durham, S.L. 66
Bethea, D. 520
Beynon, C. 56 Edlund, G. 186
Bezverkhny, I. 503 Edmonds, J. 376
Birkbeck, A.E. 398 Edworthy, J. 258, 316
Bonner, J.V.H. 253 Elder, J. 565
Bourgeois-Bougrine, S. 429 Esnouf, A. 140
Bouskill, L.M. 510, 540
Brigham, F.R. 8 Faiks, F.S. 113
Broek, J.J. 248 Fearnside, P. 409
Bruijn, O.de 285 Fernandez, J.E. 492, 498
Buckle, P.W. 21 Finch, M.I. 388
Burgess-Limerick, R. 123 Fredericks, T.K. 492
Burton, A.K. 30
Gale, A.G. 61, 456
Cabon, P. 429 Genaidy, A.M. 241
Campion, S. 179 Goillau, P.J. 419
Carter, C. 191 Gorbell, T. 587
Cartwright, S.A. 96 Gough, T.G. 236
Chambers, S. 51 Graham, R. 269, 441
Charles, P. 366 Graves, R.J. 51, 162, 560
Clarke, A. 179 Gray, M. 46
Clift-Matthews, W. 316 Green, W.S. 360
Code, S. 186 Griffin, M.J. 487
Coldwells, A. 208
Cotnam, J. 503 Haigney, D. 466
Cowieson, F. 92 Hamilton, W.I. 366
Cox, T. 174 Harrison, R.F. 213
Crawford, J.O. 101, 530 Harvey, R.S. 13
Crick, J. 295 Haslam, R.A. 66, 77
Crowther, M. 316 Haslegrave, C.M. 343, 471
Curry, M.B. 285 Haward, B. 135

Hellier, E. 321 Maguire, M.C. 269


Hoekstra, P.N. 248 Majumdar, A. 414
Hone, K. 174 May, A. 191
Hook, M. 409 May, J.L. 61, 456
Hooper, R.H. 108 McCaig, R. 46
Humphreys, N. 525 McConnell, A.K. 535
Huston, R. 241 McDougall, S.J.P. 285
McGorry, R. 503
Jackson, J. 118 McNeill, M. 592
Jafry, T. 556 Milne, T.J. 530
Jamieson, D.W. 162 Mollard, R. 429
Johnson, D.M. 424 Mon-Williams, M. 123
Jones, D. 482 Morris, L.A. 46
Jong, A.M. de 355
Jordan, P.W. 264, 551 Neary, H.T. 203
Nevill, A. 56
Kanis, H. 360, 577 Nicholls, J.A. 82
Kattel, B.P. 498 Nichols, S. 146
Kelly, C.J. 419 Nicholson, P. 409
Kerrin, M. 174 Noyes, J.M. 306, 424
Kilner, A.R. 409
Kirwan, B. 280, 404 Oostendorp, H.van 333
Klein, D. 360 O’Neill, D.H. 476, 556, 592
Kvålseth, T.O. 572
Paddan, G.S. 487
Lamoureux, T. 404 Pallant, A. 156
Lancaster, R.J. 167 Parker, C. 290
Lane, R.M. 101 Parsons, K.C. 510, 520, 525, 540
Langan-Fox, J. 186 Peijs, S. 248
Langford, J. 381 Phillips, A. 404
Layton, S. 565 Piras, M. 295
Leaver, R. 51 Plooy, A. 123
Lee, S. 20 Ponsonby, J. 560
Leighton, D. 56 Porter, J.M. 140
Leung, A.K.P. 321 Porter, M.L. 371
Life, M.A. 82 Powell, C. 151
Lindsay, J. 338
Livingston, R. 540 Rainbird, G. 156, 381
Llewellyn, M.G.A. 108 Rayson, M.P. 393
Lomas, S.M. 471 Reed, S. 258
Reeves, C. 151
Macdonald, A.S. 264, 551 Reid, F. 258
Mackay, C. 46 Reilly, T. 56, 96, 208
MacKendrick, H. 404 Reinecke, S.M. 113
MacLeod, I.S. 225 Robertson, D. 565

Robinson, B.J. 476 Tilbury-Davis, D.C. 108


Rooden, M.J. 328 Totter, A. 300
Ross, T. 446
Ryan, B. 343 Van Schaik, P. 253
Vaughan, G.M.C. 220
Schaefer, W.F. 355 Verbeek, M. 333
Selcon, S. 295 Vink, P. 355
Sharma, R.M. 130
Shaw, T. 46 Waterhouse, J. 208
Sheldon, N. 510 Watson, N. 46
Shell, R. 241 Webb, L.H. 525
Siemieniuch, C.E. 220 Wikman, J. 230
Sinclair, M.A. 203, 220 Wilkinson, A. 51
Somberg, B.L. 350 Williamson-Taylor, J.L. 582
Stanton, N. 436 Wilson, T. 461
Starr, A.F. 306 Withey, W.R. 510, 540
Stary, C. 300 Wogalter, M.S. 311
Stedmon, A.W. 388 Woodward, V.G. 419
Stewart, T. 3 Wright, E.J. 77
Stubbs, D. 20
Sturrock, F. 280 Yeo, A. 274
Young, M. 436
Taylor, R.G. 466
Tesh, K.M. 72 Zajicek, M.P. 151
Thornton, G. 118 Zhu, F. 515
SUBJECT INDEX

adaptation 466
age 208, 461
agriculture 290, 592
air-to-air combat 295
air traffic management 404, 409, 414, 419, 429
allocation of functions 213, 220
3D anthropometry 248
attention 456
attitudes 146
audit, ergonomic 371
audit, stress 167
auditory distraction 482
automatic speech recognition 441
automation 436
automotive industry 51
avionics 306

backrest 113
biomechanics 30
blind users 151
BSI 3

case study 77, 179, 203, 366, 371


CEN 3
chairs, see seating
children 92
Chinese 321
clothing 510, 520, 540
cognition 225
cognitive representation 230
cognitive walk-through 333
collaboration 258
comfort 140, 525
commercial planning 546
communication 191, 198, 230
concurrent engineering 191
constraints 565
construction 355
consumer products see products
containerisation 381
CSCW 191
cultural issues 274

dancing 66
decision support 290, 424
design 8, 225, 264, 269, 565
design needs 560
disability 30, 179
discomfort 471
display design, see interface design
drink distribution 77, 360
driver behaviour 461, 466
drivers, see driving
driving 436, 441, 446, 451, 456, 461, 471

education 130, 551


elderly drivers 461
engineers 560
engineering design 258
equipment design 587
ergonomic application 366, 371, 376
ergonomic intervention 21, 135, 350
error, human 456
error, pilot 424
evaluation 300, 419
evaluation methods 253

fatigue 429, 451


financial planning 546
fire fighters 530, 535
fuzzy logic 241

gloves 487
guidelines 130, 446, 556

hand-arm vibration, see vibration


hand tools 503
hazard 316, 321
HCI 156
health and safety 130, 174, 236, 520
see also occupational health, safety
heart rate 530, 535
heart surgery 338
heat 530
HMI integration 446
hospitals 56, 82, 87, 162, 167, 338

ice cream 503


icons 285
in-vehicle telematics 446
industrially developing countries 592
infantryman 398
information systems design 236
information theory 572
intelligent systems 241
intelligent transportation systems 441
interface design 253, 280, 285, 290, 295, 333
Internet, the 156
ISO 3

job design 203, 213, 220, 376


job efficiency 198
job satisfaction 198, 213

kinematic 113
knee flexion 108
knowledge requirements 280

legislation 72, 236


lifting see manual handling

mail processing 381


manual handling 72, 77, 82, 87, 92, 96, 101, 118, 343, 360
management of change 203
management, line 51
marketing 546
material 525
medical equipment 587
mental models 186, 404
mental workload 436
methodology 36, 179, 203, 220, 236, 253, 328, 343,
350, 376, 565
microscopes 61
military 295, 388, 393, 398
models 248, 572
workload 414
predictive 409
see also mental models
mouse 135
movement 92
musculoskeletal disorders 21, 25, 30, 36, 41, 46, 51, 56, 56, 61, 66
135, 371, 492, 498

navy 13
neck 123
noise 482
nursery carers 101
nurses 56, 87, 162
see also hospitals

occupational health 41, 355, 393, 592


see also health and safety
offices 130, 140, 482
organisational change 350
orthotics 108
OWAS 101

pain 41
participatory ergonomics 355
patient-handling 82
perfusion 338
personal protective clothing 471, 520
personnel selection 393
physical risk factors 21, 25
physical activity 208
police 471
policy 556
posture 61, 101, 123, 248
psychophysical 492
psychosocial risk factors 25, 30
pregnancy 96
product semantics 264
products 8, 253
prototype, rapid 248
prototype, virtual 419

questionnaire design 146

railways 381
reaction time 572
research 46
rivet guns 498
risk assessment 51, 56, 72, 167, 174, 366, 582
risk management 87

safety 366, 381, 476, 582


see also health and safety
safety management systems 582
sculpturing robot 248
seat belts 476
seating 113, 140
self contained breathing apparatus 535
self report 343
shared work spaces 258
sheet printing 371
shift work 208
signal words 311
simulator 338
situation awareness 306, 424
software usability 274
sound levels 311
speech 151
speed 466
spine 113
standards 3, 8, 13, 393, 487
strategy 46
stress 162, 167, 388, 451
suitability for tasks 300
support systems 198
survey 269, 560
symbols 8
syntax 577
system requirements 225

task analysis 230, 300


task demands 162
team working 186, 191
teleworking 174, 179
thermal comfort 515, 525
thermal environments 510, 515, 520, 530, 540
thermoregulatory model 515
trackerball 135
tractors 476
training 82, 560, 587
trunk asymmetry 118

usability 264, 269, 274, 253, 333


usage centred design 360, 577
user trials 328, 360

validation 577
vehicle design 471, 476
verbal protocol analysis 328, 333, 338, 343
vibration 487, 492, 498
virtual reality 146
virtual teams 191
vision 123
visual strain 61

walking 108
warnings 306, 311, 316, 321
wheelchair 525
workload 376, 409
work process analysis 350
work systems 241
World Wide Web 151, 156
WRULDs 135, 174, 503
