Quality in Health Care 1995;4:80-89
Understanding adverse events: human factors
James Reason
Department of Psychology, University of Manchester, Manchester M13 9PL
James Reason, professor
A decade ago, very few specialists in human
factors were involved in the study and
prevention of medical accidents. Now there are
many. Between the 1940s and 1980s a major
concern of that community was to limit the
human contribution to the conspicuously
catastrophic breakdown of high hazard enterprises such as air, sea, and road transport;
nuclear power generation; chemical process
plants; and the like. Accidents in these systems
cost many lives, create widespread environmental damage, and generate much public and
political concern.
By contrast, medical mishaps mostly affect
single individuals in a wide variety of healthcare institutions and are seldom discussed
publicly. Only within the past few years has the
likely extent of these accidental injuries become
apparent. The Harvard medical practice study found that 4% of patients in hospital in New York state in 1984 sustained unintended injuries caused by their treatment. For New York state this amounted to 98 600 injuries in one year and, when extrapolated to the entire United States, to the staggering figure of 1.3 million people harmed annually - more than twice the number injured in one year in road accidents in the United States.1 2
Since the mid-1980s several interdisciplinary
research groups have begun to investigate the
human and organizational factors affecting
the reliability of healthcare provision. Initially,
these collaborations were focused around the
work of anaesthetists and intensivists,3 4 partly
because these professionals' activities shared
much in common with those of more widely
studied groups such as pilots and operators
of nuclear power plants. This commonality
existed at two levels.
* At the "sharp end" (that is, at the immediate
human-system or doctor-patient interface)
common features include uncertain and
dynamic environments, multiple sources of
concurrent information, shifting and often ill
defined goals, reliance on indirect or inferred
indications, actions having immediate and
multiple consequences, moments of intense
time stress interspersed with long periods
of routine activity, advanced technologies
with many redundancies, complex and often
confusing human-machine interfaces, and
multiple players with differing priorities and
high stakes.5
* At an organisational level these activities are
carried on within complex, tightly coupled
institutional settings and entail multiple
interactions between different professional
groups.6 This is extremely important not only for understanding the character and aetiology of medical mishaps but also for devising more effective remedial measures.
More recently, the interest in the human
factors of health care has spread to a wide range
of medical specialties (for example, general
practice, accident and emergency care, obstetrics
and gynaecology, radiology, psychiatry, surgery,
etc). This burgeoning concern is reflected in
several recent texts and journal articles devoted
to medical accidents7-9 and in the creation of
incident monitoring schemes that embody
leading edge thinking with regard to human
and organisational contributions.9 One of the
most significant consequences of the collaboration between specialists in medicine and in
human factors is the widespread acceptance
that models of causation of accidents developed for domains such as aviation and nuclear
power generation apply equally well to most
healthcare applications. The same is also true
for many of the diagnostic and remedial
measures that have been created within these
non-medical areas.
I will first consider the different ways in
which humans can contribute to the breakdown of complex, well defended technologies.
Then I will show how these various contributions may be combined within a generic
model of accident causation and illustrate its
practical application with two case studies of
medical accidents. Finally, I will outline the
practical implications of such models for
improving risk management within the healthcare domain.
Human contribution
A recent survey of published work on human
factors disclosed that the estimated contribution of human error to accidents in
hazardous technologies increased fourfold
between the 1960s and '90s, from minima of
around 20% to maxima of beyond 90%.10 One
possible inference is that people have become
more prone to error. A likelier explanation,
however, is that equipment has become more
reliable and that accident investigators have
become increasingly aware that safety-critical
errors are not restricted to the "sharp end."
Figures of around 90% are hardly surprising
considering that people design, build, operate,
maintain, organise, and manage these systems.
The large contribution of human error is more
a matter of opportunity than the result of
excessive carelessness, ignorance, or recklessness. Whatever the true figure, though, human
behaviour - for good or ill - clearly dominates
the risks to modern technological systems - medical or otherwise.
Not long ago, these human contributions
would have been lumped together under the
catch all label of "human error." Now it is
apparent that unsafe acts come in many forms
- slips, lapses and mistakes,
errors and
violations - each having different psychological
origins and requiring different countermeasures. Nor can we take account of only
those human failures that were the proximal
causes of an accident. Major accident inquiries
(for example those for Three Mile Island
nuclear reactor accident, Challenger (space
shuttle) explosion, King's Cross underground
fire, Herald of Free Enterprise capsizing, Piper
Alpha explosion and fire, Clapham rail disaster,
Exxon Valdez oil spill, Kegworth air crash, etc)
make it apparent that the human causes of
major accidents are distributed very widely,
both within an organisation as a whole and
over several years before the actual event. In
consequence, we also need to distinguish
between active failures (having immediate
adverse outcomes) and latent or delayed action
failures that can exist for long periods before
combining with local triggering events to
penetrate the system's defences.
Human errors may be classified either by
their consequences or by their presumed
causes. Consequential classifications are already
widely used in medicine. The error is described
in terms of the proximal actions contributing
to a mishap (for example, administration of a
wrong drug or a wrong dose, wrong intubation,
nerve or blood vessel unintentionally severed
during surgery, etc). Causal classifications,
on the other hand, make assumptions about
the psychological mechanisms implicated in
generating the error. Since causal or psychological classifications are not widely used in
medicine (though there are notable exceptions,
see Gaba,4 Runciman et al9) a brief description
of the main distinctions among types of
errors and their underlying rationale is given
below.
Psychologists divide errors into two causally
determined groups (see Reason11), as summarised in figure 1.
SLIPS AND LAPSES VERSUS MISTAKES: THE FIRST
DISTINCTION
Error can be defined in many ways. For my
present purpose an error is the failure of
planned actions to achieve their desired goal.
There are basically two ways in which this
failure can occur, as follows.
* The plan is adequate, but the associated
actions do not go as intended. The failures
are failures of execution and are commonly termed slips and lapses. Slips relate
to observable actions and are associated
with attentional failures. Lapses are more
internal events and relate to failures of
memory.
* The actions may go entirely as planned, but
the plan is inadequate to achieve its intended
outcome. These are failures of intention,
termed mistakes. Mistakes can be further
subdivided into rule based mistakes and
knowledge based mistakes (see below).
All errors involve some kind of deviation. In the
case of slips, lapses, trips and fumbles, actions
deviate from the current intention. Here the
failure occurs at the level of execution. For
mistakes, the actions may go entirely as
planned but the plan itself deviates from some
adequate path towards its intended goal. Here
the failure lies at a higher level: with the mental
processes involved in planning, formulating
intentions, judging, and problem solving.
Slips and lapses occur during the largely
automatic performance of some routine task,
usually in familiar surroundings. They are
almost invariably associated with some form of
attentional capture, either distraction from the
immediate surroundings or preoccupation with
something in mind. They are also provoked by
change, either in the current plan of action
or in the immediate surroundings. Figure 2
shows the further subdivisions of slips and
lapses; these have been discussed in detail
elsewhere.11
Mistakes can begin to occur once a problem
has been detected. A problem is anything that
requires a change or alteration of the current
plan. Mistakes may be subdivided into two
groups, as follows.
* Rule based mistakes, which relate to problems for which the person possesses some
prepackaged solution, acquired as the result
of training, experience, or the availability
of appropriate procedures. The associated
errors may come in various forms: the misapplication of a good rule (usually because of
a failure to spot the contraindications), the
application of a bad rule, or the nonapplication of a good rule.
* Knowledge based mistakes, which occur in
novel situations where the solution to a problem has to be worked out on the spot without
the help of preprogrammed solutions. This
entails the use of slow, resource-limited
but computationally-powerful conscious
reasoning carried out in relation to what is
often an inaccurate and incomplete "mental
model" of the problem and its possible
causes. Under these circumstances the human
mind is subject to several powerful biases, of
which the most universal is confirmation
bias. This was described by Sir Francis Bacon
more than 300 years ago. "The human mind
when it has once adopted an opinion draws all things else to support and agree with it."12 Confirmation bias or "mindset" is particularly evident when trying to diagnose what has gone wrong with a malfunctioning system. We "pattern match" a possible cause to the available signs and symptoms and then seek out only that evidence that supports this particular hunch, ignoring or rationalising away contradictory facts. Other biases have been discussed elsewhere.11
Fig 1 Distinguishing slips, lapses, and mistakes: errors comprise slips, lapses, trips, and fumbles (execution failures) and mistakes (planning or problem solving failures)
Fig 2 Varieties of slips and lapses: recognition failures, attentional failures, memory failures, and selection failures
ERRORS VERSUS VIOLATIONS: THE SECOND
DISTINCTION
Violations are deviations from safe operating
practices, procedures, standards, or rules.
Here, we are mostly interested in deliberate
violations, in which the actions (though
not the possible bad consequences) were
intended.
Violations fall into three main groups.
* Routine violations, which entail cutting
corners whenever such opportunities present
themselves
* Optimising violations, or actions taken to
further personal rather than strictly task
related goals (that is, violations for "kicks" or
to alleviate boredom)
* Necessary or situational violations that seem
to offer the only path available to getting the
job done, and where the rules or procedures
are seen to be inappropriate for the present
situation.
Deliberate violations differ from errors in
several important ways.
* Whereas errors arise primarily from
informational problems (that is, forgetting,
inattention, incomplete knowledge, etc)
violations are more generally associated with
motivational problems (that is, low morale,
poor supervisory example, perceived lack of
concern, the failure to reward compliance
and sanction non-compliance, etc)
* Errors can be explained by what goes on in
the mind of an individual, but violations
occur in a regulated social context
* Errors can be reduced by improving the
quality and the delivery of necessary information within the workplace. Violations
require motivational and organizational
remedies.
ACTIVE VERSUS LATENT HUMAN FAILURES: THE THIRD DISTINCTION
In considering how people contribute to accidents a third and very important distinction is necessary - namely, that between active and latent failures. The difference concerns the length of time that passes before human failures are shown to have an adverse impact on safety. For active failures the negative outcome is almost immediate, but for latent failures the consequences of human actions or decisions can take a long time to be disclosed, sometimes many years.
The distinction between active and latent failures owes much to Mr Justice Sheen's observations on the capsizing of the Herald of Free Enterprise. In his inquiry report, he wrote:
"At first sight the faults which led to this disaster were the ... errors of omission on the part of the Master, the Chief Officer and the assistant bosun ... But a full investigation into the circumstances of the disaster leads inexorably to the conclusion that the underlying or cardinal faults lay higher up in the Company ... From top to bottom the body corporate was infected with the disease of sloppiness."13
Here the distinction between active and latent failures is made very clear. The active failures - the immediate causes of the capsize - were various errors on the part of the ship's officers and crew. But, as the inquiry disclosed, the ship was a "sick" ship even before it sailed from Zeebrugge on 6 March 1987.
To summarise the differences between active and latent failures:
* Active failures are unsafe acts (errors and violations) committed by those at the "sharp end" of the system (surgeons, anaesthetists, nurses, physicians, etc). It is the people at the human-system interface whose actions can, and sometimes do, have immediate adverse consequences
* Latent failures are created as the result of decisions taken at the higher echelons of an organisation. Their damaging consequences may lie dormant for a long time, only becoming evident when they combine with local triggering factors (for example, the spring tide, the loading difficulties at Zeebrugge harbour, etc) to breach the system's defences.
Thus, the distinction between active and latent failures rests on two considerations: firstly, the length of time before the failures have a bad outcome and, secondly, where in an organisation the failures occur. Generally, medical active failures are committed by those people in direct contact with the patient, and latent failures occur within the higher echelons of the institution, in the organisational and management spheres. A brief account of a model showing how top level decisions create conditions that produce accidents in the workplace is given below.
Aetiology of "organisational" accidents
The technological advances of the past 20 years, particularly in regard to engineered safety features, have made many hazardous systems largely proof against single failures, either human or mechanical. Breaching the "defences in depth" now requires the unlikely confluence of several causal streams. Unfortunately, the increased automation afforded by cheap computing power also provides greater opportunities for the insidious accumulation of latent failures within the system as a whole. Medical systems and items of equipment have become more opaque to the people who work them and are thus especially prone to the rare, but often catastrophic, "organisational accident." Tackling these organisational failures represents a major challenge in medicine and elsewhere.
Fig 3 Stages of development of an organisational accident: management decisions and organisational processes (the corporate culture) create error producing and violation producing conditions in the local climate (situation and task); these give rise to errors and violations, which combine with latent failures in the defences and barriers.
Figure 3 shows the anatomy of an organisational accident, the direction of causality being from left to right. The accident sequence begins with the negative consequences of organisational processes (that is, decisions concerned with planning, scheduling, forecasting,
designing, policy making, communicating,
regulating, maintaining, etc). The latent failures
so created are transmitted along various
organizational and departmental pathways to
the workplace (the operating theatre, the ward,
etc), where they create the local conditions
that promote the commission of errors and
violations (for example, understaffing, high
workload, poor human equipment interfaces,
etc). Many of these unsafe acts are likely to be
committed, but only very few of them will
penetrate the defences to produce damaging
outcomes. The fact that engineered safety features, standards, controls, procedures, etc, can be deficient due to latent failures as well as active failures is shown in the figure by the arrow connecting organisational processes directly to defences.
Case 1: Therac-25 accident at East Texas Medical Centre
(1986)
A 33 year old man was due to receive his ninth radiation treatment after
surgery for the removal of a tumour on his left shoulder. The radiotherapy
technician positioned him on the table and then went to her adjoining control
room. The Therac-25 machine had two modes: a high power "x ray" mode
and a low power "electron beam" mode. The high power mode was selected
by typing an "x" on the keyboard of the VT100 terminal. This put the
machine on maximum power and inserted a thick metal plate between the
beam generator and the patient. The plate transformed the 25 million volt
electron beam into therapeutic x rays. The low power mode was selected by
typing "e" and was designed to deliver a 200 rad beam to the tumour.
The intention on this occasion was to deliver the low power beam. But
the technician made a slip and typed in an "x" instead of an "e." She
immediately detected her error, pressed the "up" arrow to select the edit
functions from the screen menu and changed the incorrect "x" command
to the desired "e" command. The screen now confirmed that the machine
was in electron beam mode. She returned the cursor to the bottom of the
screen in preparation for the "beam ready" display showing that the machine
was fully charged. As soon as the "beam ready" signal appeared she depressed
the "b" key to activate the beam.
What she did not realise - and had no way of knowing - was that an undetected bug in the software had retracted the thick metal protective plate
(used in the x ray mode) but had left the power setting on maximum. As
soon as she activated the "b" command, a blast of 25 000 rads was delivered
to the patient's unprotected shoulder. He saw a flash of blue light (Cherenkov
radiation), heard his flesh frying, and felt an excruciating pain. He called out
to the technician, but both the voice and video intercom were switched off.
Meanwhile, back in the control room, the computer screen displayed a
"malfimction 54" error signal. This meant little to the technician. She took
it to mean that the beam had not fired, so reset the machine to fire again.
Once again, she received the "malfunction 54" signal, and once more she
reset and fired the machine. As a result, the patient received three 25 000 rad blasts to his neck and upper torso, although the technician's display
showed that he had only received a tenth of his prescribed treatment dose.
The patient died four months later with gaping lesions on his upper body.
His wry comment was: "Captain Kirk forgot to put his phaser on stun."
A very similar incident occurred three weeks later. Subsequently,
comparable overdoses were discovered to have been administered in three
other centres using the same equipment.
The model presents the people at the sharp
end as the inheritors rather than as the
instigators of an accident sequence. This may
seem as if the "blame" for accidents has been
shifted from the sharp end to the system
managers. But this is not the case for the
following reasons.
* The attribution of blame, though often
emotionally satisfying, hardly ever translates
into effective countermeasures. Blame implies
delinquency, and delinquency is normally
dealt with by exhortations and sanctions.
But these are wholly inappropriate if the
individual people concerned did not choose
to err in the first place, nor were they appreciably prone to error.
* High level management and organisational
decisions are shaped by economic, political,
and operational constraints. Like designs,
decisions are nearly always a compromise. It
is thus axiomatic that all strategic decisions
will carry some negative safety consequences
for some part of the system. This is not to say
that all such decisions are flawed, though
some of them will be. But even those
decisions judged at the time as being good
ones will carry a potential downside.
Resources, for example, are rarely allocated
evenly. There are nearly always losers. In
judging uncertain futures some of the shots
will inevitably be called wrong. The crux of
the matter is that we cannot prevent the
creation of latent failures; we can only make
their adverse consequences visible before
they combine with local triggers to breach the
system's defences.
These organizational root causes are further
complicated by the fact that the healthcare system as a whole involves many
interdependent organizations: manufacturers,
government agencies, professional and patient
organizations, etc. The model shown in figure
3 relates primarily to a given institution, but the
reality is considerably more complex, with the
behaviour of other organizations impinging
on the accident sequence at many different
points.
Applying the organizational accident
model in medicine: two case studies
Two radiological case studies are presented
to give substance to this rather abstract
theoretical framework and to emphasise some
important points regarding the practice of high
tech medicine. Radiological mishaps tend to
be extensively investigated, particularly in the
United States, where these examples occurred.
But organisational accidents should not be
assumed to be unique to this specialty. An
entirely comparable anaesthetic case study
has been presented elsewhere.14 15 Generally,
though, medical accidents have rarely been
investigated to the extent that their systemic
and institutional root causes are disclosed, so
the range of suitable case studies is limited.
The box describes details of the first case
study.
Several latent failures contributed to this
accident.
* The Canadian manufacturer had not considered it possible that a technician could
enter that particular sequence of keyboard
commands within the space of eight seconds
and so had not tested the effects of these
closely spaced inputs
* The technician had not been trained to
interpret the error signals
* It was regarded as normal practice to carry
out radiation treatment without video or
sound communication with the patient
* Perhaps most significantly, the technician
was provided with totally inadequate feedback regarding the state of the machine and
its prior activity.
This case study provides a clear example of
what has been called "clumsy automation."3 16 17 Automation intended to
reduce errors created by the variability of
human performance increases the probability
of certain kinds of mistakes by making the
system and its current state opaque to the
people who operate it. Comparable problems
have been identified in the control rooms of
nuclear power plants, on the flight decks of
modern airliners, and in relation to contemporary anaesthetic work stations.17 Automation and "defence in depth" mean that these
complex systems are largely protected against
single failures. But they render the workings of
the system more mysterious to its human
controllers. In addition, they permit the subtle
build up of latent failures, hidden behind
high technology interfaces and within the
interdepartmental interstices of complex
organisations.
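The kind of opacity described above can be illustrated with a purely schematic sketch. This is not the actual Therac-25 software and all the names are invented; it only shows the general pattern in which a display echoes the operator's most recent command while the hardware is reconfigured by a separate, slower process, so that screen and machine can briefly disagree.

```python
from dataclasses import dataclass

@dataclass
class TherapyMachine:
    """Schematic illustration only - not the Therac-25 implementation."""
    commanded_mode: str = "x ray"   # what the operator most recently selected
    hardware_mode: str = "x ray"    # how the beam hardware is actually configured
    reconfiguring: bool = False

    def operator_edit(self, mode: str) -> None:
        # The edit is accepted immediately; the hardware change is only requested.
        self.commanded_mode = mode
        self.reconfiguring = True

    def hardware_settles(self) -> None:
        # Called later, when the slower hardware reconfiguration completes.
        self.hardware_mode = self.commanded_mode
        self.reconfiguring = False

    def display(self) -> str:
        # The screen echoes the commanded mode, not a verified hardware state.
        return f"mode: {self.commanded_mode}"

machine = TherapyMachine()
machine.operator_edit("electron beam")   # operator corrects the mode
print(machine.display())                 # "mode: electron beam" - looks right
print(machine.hardware_mode)             # still "x ray" until hardware_settles() runs
```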
The second case study has all the causal
hallmarks of an organizational accident but
differs from most medical mishaps in
having adverse outcomes for nearly 100
people. The accident is described in detail
elsewhere.18
Case 2: Omnitron 2000 accident at Indiana Regional Cancer
Centre (1992)
An elderly patient with anal carcinoma was treated with high dose rate (HDR) brachytherapy. Five catheters were placed in the tumour. An iridium-192 source (4.3 curie, 1.6E+11 becquerel) was intended to be located in various positions within each catheter, using a remotely controlled Omnitron 2000 afterloader. The treatment was the first of three treatments planned by the doctor, and the catheters were to remain in the patient for the subsequent treatments.
The iridium source wire was placed in four of the catheters without apparent difficulty, but after several unsuccessful attempts to insert the source wire into the fifth catheter, the treatment was terminated. In fact, a wire had broken, leaving an iridium source inside one of the first four catheters. Four days later the catheter containing the source came loose and eventually fell out of the patient. It was picked up and placed in a storage room by a member of staff of the nursing home, who did not realise it was radioactive. Five days later a truck picked up the waste bag containing the source. As part of the driver's normal routine the bag was then driven to the depot and remained there for a day (during Thanksgiving) before being delivered to a medical waste incinerator, where the source was detected by fixed radiation monitors at the site. It was left over the weekend but was then traced to the nursing home. It was retrieved nearly three weeks after the original treatment. The patient had died five days after the treatment session, and in the ensuing weeks over 90 people had been irradiated in varying degrees by the iridium source.
The accident occurred as the result of a
combination of procedural violations (resulting
in breached or ignored defences) and latent
failures.
Active failures
* The area radiation monitor alarmed several
times during the treatment but was ignored,
partly because the doctor and technicians
knew that it had a history of false alarms
* The console indicator showed "safe" and the
attending staff mistakenly believed the source
to be fully retracted into the lead shield
* The truck driver deviated from company
procedures when he failed to check the
nursing home waste with his personal
radiation survey meter.
Latent failures
* The rapid expansion of high dose rate
brachytherapy, from one to ten facilities in
less than a year, had created serious weaknesses in the radiation safety programme
* Too much reliance was placed on unwritten
or informal procedures and working
practices
* There were serious inadequacies in the
design and testing of the equipment
* There was a poor organisational safety culture. The technicians routinely ignored
alarms and did not survey patients, the afterloader, or the treatment room after high dose
rate procedures.
* There was weak regulatory oversight. The
Nuclear Regulatory Commission did not
adequately address the problems and dangers
associated with high dose rate procedures.
This case study illustrates how a combination
of active failures and latent systemic weaknesses can conspire to penetrate the many
layers of defences which are designed to
protect both patients and staff. No one person
was to blame; each person acted according
to his or her appraisal of the situation, yet
one person died and over 90 people were
irradiated.
Principled risk management
In many organisations managing the human
risks has concentrated on trying to prevent
the recurrence of specific errors and violations
that have been implicated in particular local
mishaps. The common internal response to
such events is to issue new procedures that
proscribe the particular behaviour; to devise
engineering "retro-fixes" that will prevent such
actions having adverse outcomes; to sanction,
exhort, and retrain key staff in an effort to make
them more careful; and to introduce increased
automation. This "anti-personnel" approach
has several problems.
(1) People do not intend to commit errors. It
is therefore difficult for others to control
what people cannot control for themselves.
(2) The psychological precursors of an error
(that is, inattention, distraction, preoccupation, forgetting, fatigue, and stress)
are probably the last and least manageable
links in the chain of events leading to an
error.
(3) Accidents rarely occur as the result of
single unsafe acts. They are the product of
many factors: personal, task related, situational, and organisational. This has two
implications. Firstly, the mere recurrence
of some act involved in a previous accident
will probably not have an adverse outcome
in the absence of the other causal factors.
Secondly, so long as these underlying
latent problems persist, other acts - not hitherto regarded as unsafe - can also serve to complete an incipient accident sequence.
(4) These countermeasures can create a false
sense of security.3 Since modern systems
are usually highly reliable some time is
likely to pass between implementing these
personnel related measures and the next
mishap. During this time, those who have
instituted the changes are inclined to
believe that they have fixed the problem.
But then a different kind of mishap occurs,
and the cycle of local repairs begins all over
again. Such accidents tend to be viewed
in isolation, rather than being seen as
symptomatic of some underlying systemic
malaise.
(5) Increased automation does not cure the
human factors problem, it simply changes
its nature. Systems become more opaque to
their operators. Instead of causing harm by
slips, lapses, trips and fumbles, people are
now more prone to make mistaken judgements about the state of the system.
The goal of effective risk management is not
so much to minimise particular errors and
violations as to enhance human performance at
all levels of the system.3 Perhaps paradoxically,
most performance enhancement measures are
not directly focused at what goes on inside
the heads of single individuals. Rather, they
are directed at team, task, situation, and
organisational factors, as discussed below.
TEAM FACTORS
A great deal of health care is delivered by
multidisciplinary teams. Over a decade of
experience in aviation (and, more recently,
marine technology) has shown that measures
designed to improve team management and
the quality of the communications between
team members can have an enormous impact
on human performance. Helmreich (one of the
pioneers of crew resource management) and
his colleagues at the University of Texas
analysed 51 aircraft accidents and incidents,
paying special attention to team related
factors.19 The box summarises their findings,
where the team related factors are categorised
as negative (having an adverse impact upon
safety and survivability) or positive (acting to
improve survivability). The numbers given in
each case relate to the number of accidents or
incidents in which particular team related
factors had a negative or a positive role.
This list offers clear recommendations for
the interactions of medical teams just as much
as for aircraft crews. Recently, the aviation
psychologist Robert Helmreich and the anaesthetist Hans-Gerhard Schaefer studied team
performance in the operating theatre of a Swiss
teaching hospital.20 They noted that "interpersonal and communications issues are
responsible for many inefficiencies, errors, and
frustrations in this psychologically and organisationally complex environment."8 They also
observed that attempts to improve institutional
performance largely entailed throwing money
at the problem through the acquisition of new
and ever more advanced equipment, whereas improvements to training and team performance could be achieved more effectively at a fraction of this cost. As has been clearly shown for aviation, formal training in team management and communication skills can produce substantial improvements in human performance as well as reducing safety-critical errors.
Team related factors and role in 51
aircraft accidents and incidents*
Team concept and environment for open
communications established (negative 7;
positive 2)
Briefings are operationally thorough, interesting,
and address crew coordination and planning for
potential problems. Expectations are set for how
possible deviations from normal operations are
to be handled (negative 9; positive 2)
Cabin crew are included as part of the team in
briefings, as appropriate, and guidelines are
established for coordination between flight deck
and cabin (negative 2)
Group climate is appropriate to operational
situation (for example, presence of social
conversation). Crew ensures that nonoperational factors such as social interaction do
not interfere with necessary tasks (negative 13;
positive 4)
Crew members ask questions regarding crew
actions and decisions (negative 11; positive 4)
Crew members speak up and state their
information with appropriate persistence until
there is some clear resolution or decision
(negative 14; positive 4)
Captain coordinates flight deck activities to
establish proper balance between command
authority and crew member participation and
acts decisively when the situation requires it
(negative 18; positive 4)
Workload and task distribution are clearly
communicated and acknowledged by crew
members. Adequate time is provided for the
completion of tasks (negative 12; positive 4)
Secondary tasks are prioritised to allow sufficient
resources for dealing effectively with primary
duties (negative 5; positive 2)
Crew members check with each other during
times of high and low workload to maintain
situational awareness and alertness (negative 3;
positive 3)
Crew prepares for expected contingency
situations (negative 28; positive 4)
Guidelines are established for the operation and
disablement of automated systems. Duties and
responsibilities with regard to automated
systems are made clear. Crew periodically review
and verify the status of automated systems.
Crew verbalises and acknowledges entries and
changes to automated systems. Crew allows
sufficient time for programming automated
systems before manoeuvres (negative 14)
When conflicts arise the crew remains focused
on the problem or situation at hand. Crew
members listen actively to ideas and opinions
and admit mistakes when wrong (negative 2)
*After Helmreich et al19
TASK FACTORS
Tasks vary widely in their liability to promote errors. Identifying and modifying tasks and task elements that are conspicuously prone to failure are essential steps in risk management.
The following simple example is representative of many maintenance tasks. Imagine a bolt with eight nuts on it. Each nut is coded and has to be located in a particular sequence. Disassembly is virtually error free. There is only one way in which the nuts can be removed from the bolt and all the necessary knowledge to perform this task is located in the world (that is, each step in the procedure is automatically cued by the preceding one). But the task of correct reassembly is immensely more difficult. There are over 40 000 ways in which this assemblage of nuts can be wrongly located on the bolt (factorial 8). In addition, the knowledge necessary to get the nuts back in the right order has to be either memorised or read from some written procedure, both of which are highly liable to error or neglect. Such an example may seem at first sight to be far removed from the practice of medicine, but medical equipment, like any other sophisticated hardware, requires careful maintenance, and maintenance errors (particularly omitting necessary reassembly steps) constitute one of the greatest sources of human factors problems in high technology industries.11
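As a quick check of the figure quoted above (a sketch added here for illustration, not part of the original article), the number of possible orderings of eight distinct nuts is 8! = 40 320, only one of which is correct:

```python
from math import factorial

orderings = factorial(8)          # all possible reassembly sequences for 8 coded nuts
wrong_orderings = orderings - 1   # every sequence except the single correct one

print(orderings)        # 40320
print(wrong_orderings)  # 40319 - hence "over 40 000 ways" to reassemble wrongly
```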
Effective incident monitoring is an invaluable tool in identifying tasks prone to
error. On the basis of their body of nearly
4000 anaesthetic and intensive care incidents,
Runciman et al at the Royal Adelaide Hospital
(see Runciman et al9 for a report of the first
2000 incidents) introduced many inexpensive
equipment modifications guaranteed to
enhance performance and to minimise
recurrent errors. These include colour coded
syringes and endotracheal tubes graduated to
help non-intrusive identification of endobronchial intubation.21
SITUATIONAL FACTORS
Each type of task has its own nominal error probability. For example, carrying out a totally novel task with no clear idea of the likely consequences (that is, knowledge based processing) has a basic error probability of 0.75. At the other extreme, a highly familiar, routine task performed by a well motivated and competent workforce has an error probability of 0.0005. But there are certain conditions, both of the individual person and his or her immediate environment, that are guaranteed to increase these nominal error probabilities (table 1). Here the error producing conditions are ranked in the order of their known effects and the numbers in parentheses indicate the risk factor (that is, the amount by which the nominal error rates should be multiplied under the worst conditions). Notably, three of the best researched factors - namely, sleep disturbance, hostile environment, and boredom - carry the least penalties. Also, those error producing factors at the top of the list are those that lie squarely within the organisational sphere of influence. This is a central element in the present view of organisational accidents. Managers and administrators rarely, if ever, have the opportunity to jeopardise a system's safety directly. Their influence is more indirect: top level decisions create the conditions that promote unsafe acts.
Table 1 Summary of error producing conditions ranked in order of known effect (after Williams22)
Condition (risk factor)
Unfamiliarity with the task (x 17)
Time shortage (x 11)
Poor signal:noise ratio (x 10)
Poor human system interface (x 8)
Designer user mismatch (x 8)
Irreversibility of errors (x 8)
Information overload (x 6)
Negative transfer between tasks (x 5)
Misperception of risk (x 4)
Poor feedback from system (x 4)
Inexperience - not lack of training (x 3)
Poor instructions or procedures (x 3)
Inadequate checking (x 3)
Educational mismatch of person with task (x 2)
Disturbed sleep patterns (x 1.6)
Hostile environment (x 1.2)
Monotony and boredom (x 1)
For convenience, error producing conditions can be reduced to seven broad categories: high workload; inadequate knowledge, ability, or experience; poor interface design; inadequate supervision or instruction; stressful environment; mental state (fatigue, boredom, etc); and change. Departures from routine and changes in the circumstances in which actions are normally performed constitute a major factor in absentminded slips of action.23
Compared to error producing conditions, the factors that promote violations are less well understood. Ranking their relative effects is not possible. However, we can make an informed guess at the nature of these violation producing conditions, as shown in table 2, although in no particular order of effect. Again, for causal analysis this list can be reduced to a few general categories: lack of safety culture, lack of concern, poor morale, norms condoning violation, "can do" attitudes, and apparently meaningless or ambiguous rules.
Table 2 Violation producing conditions, unranked
Manifest lack of organisational safety culture
Conflict between management and staff
Poor morale
Poor supervision and checking
Group norms condoning violations
Misperception of hazards
Perceived lack of management care and concern
Little elan or pride in work
Culture that encourages taking risks
Beliefs that bad outcomes will not happen
Low self esteem
Learned helplessness
Perceived licence to bend rules
Ambiguous or apparently meaningless rules
Rules inapplicable due to local conditions
Inadequate tools and equipment
Inadequate training
Time pressure
Professional attitudes hostile to procedures
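To make the use of these figures concrete, the following minimal sketch (added here for illustration, not part of the original article; the function name and the particular pairings chosen are assumptions of the sketch) shows how a single risk factor from table 1 scales a task's nominal error probability under the worst conditions:

```python
# Nominal error probabilities quoted in the text above.
NOMINAL_ERROR_PROBABILITY = {
    "novel task, knowledge based processing": 0.75,
    "familiar routine task, motivated and competent workforce": 0.0005,
}

# A few error producing conditions from table 1 with their risk factors
# (the amount by which the nominal rate is multiplied under the worst conditions).
RISK_FACTOR = {
    "unfamiliarity with the task": 17,
    "time shortage": 11,
    "inadequate checking": 3,
    "monotony and boredom": 1,
}

def worst_case_error_probability(task: str, condition: str) -> float:
    """Scale a task's nominal error probability by a single condition's risk factor."""
    return min(NOMINAL_ERROR_PROBABILITY[task] * RISK_FACTOR[condition], 1.0)

# A familiar routine task carried out under severe time shortage:
print(round(worst_case_error_probability(
    "familiar routine task, motivated and competent workforce", "time shortage"), 4))
# -> 0.0055, that is, 11 times the nominal 0.0005
```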
ORGANISATIONAL FACTORS
Quality and safety, like health and happiness,
have two aspects: a negative aspect disclosed
by incidents and accidents and a positive
aspect, to do with the system's intrinsic resistance to human factors problems. Whereas
incidents and accidents convert easily into
numbers, trends, and targets, the positive
aspect is much harder to identify and measure.
Accident and incident reporting procedures
are a crucial part of any safety or quality
information system. But, by themselves, they
are insufficient to support effective quality and
safety management. The information they
provide is both too little and too late for this
longer term purpose. To promote proactive
accident prevention rather than reactive "local
repairs" an organisation's "vital signs" should
be monitored regularly.
When a doctor carries out a routine medical
check he or she samples the state of several
critical bodily systems: the cardiovascular,
pulmonary, excretory, neurological systems,
and so on. From individual measures of blood
pressure, electrocardiographic activity, cholesterol concentration, urinary contents, reflexes,
and so on the doctor makes a professional
judgement about the individual's general state
of health. There is no direct, definitive measure
of a person's health. It is an emergent property
inferred from a selection of physiological signs
and lifestyle indicators. The same is also true
of complex hazardous systems. Assessing an
organisation's current state of "safety health,"
as in medicine, entails regular and judicious
sampling of a small subset of a potentially large
number of indices. But what are the dimensions along which to assess organisational
"safety health?"
Several such diagnostic techniques are
already being implemented in various
industries.24 The individual labels for the
assessed dimensions vary from industry to
industry (oil exploration and production,
tankers, helicopters, railway operations, and
aircraft engineering), but all of them have been
guided by two principles. Firstly, they try to
include those organizational "pathogens" that
have featured most conspicuously in well
documented accidents (that is, hardware
defects, incompatible goals, poor operating
procedures, understaffing, high workload,
inadequate training, etc). Secondly, they seek
to encompass a representative sampling of
those core processes common to all technological organizations (that is, design, build,
operate, maintain, manage, communicate, etc).
Since there is unlikely to be a single universal
set of indicators for all types of hazardous
operations one way of communicating how
safety health can be assessed is simply to list
the organisational factors that are currently
measured (see table 3). Tripod-Delta, commissioned by Shell International and currently
implemented in several of its exploration and
production operating companies, on Shell
tankers, and on its contracted helicopters in the
North Sea, assesses the quarterly or half yearly
state of 11 general failure types in specific
workplaces: hardware, design, maintenance
management, procedures, error enforcing
conditions, housekeeping, incompatible goals,
organizational structure, communication,
training, and defences. A discussion of the
rationale behind the selection and measurement of these failure types can be found
elsewhere.25
Tripod-Delta uses tangible, dimension
related indicators as direct measures or
"symptoms" of the state of each of the 11
failure types. These indicators are generated by
task specialists and are assembled into checklists by a computer program (Delta) for each
testing occasion. The nature of the indicators
varies from activity to activity (that is, drilling,
seismic surveys, transport, etc) and from test
to test. Examples of such indicators for
design associated with an offshore platform
are listed below. All questions have yes/no
answers.
* Was this platform originally designed to be unmanned?
* Are shut-off valves fitted at a height of more than 2 metres?
* Is standard (company) coding used for the pipes?
* Are there locations on this platform where the deck and walkways differ in height?
* Have there been more than two unscheduled maintenance jobs over the past week?
* Are there any bad smells from the low pressure vent system?
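A minimal sketch of how such yes/no indicators might be assembled into a checklist and summarised per failure type is given below. This is not the actual Delta program; the indicators chosen, the "unfavourable answer" markings, and the scoring scheme are all invented for illustration.

```python
from collections import defaultdict

# Each indicator: (failure type, question, answer that counts against that dimension).
INDICATORS = [
    ("design", "Are there locations on this platform where the deck and walkways differ in height?", True),
    ("design", "Are shut-off valves fitted at a height of more than 2 metres?", True),
    ("procedures", "Is standard (company) coding used for the pipes?", False),
    ("maintenance management", "Have there been more than two unscheduled maintenance jobs over the past week?", True),
]

# Yes/no answers collected on one testing occasion (True = "yes").
answers = {
    "Are there locations on this platform where the deck and walkways differ in height?": True,
    "Are shut-off valves fitted at a height of more than 2 metres?": False,
    "Is standard (company) coding used for the pipes?": True,
    "Have there been more than two unscheduled maintenance jobs over the past week?": True,
}

# Build a simple per-dimension profile of unfavourable answers,
# of the kind that could be plotted as a bar graph profile.
profile = defaultdict(int)
for dimension, question, unfavourable in INDICATORS:
    if answers[question] == unfavourable:
        profile[dimension] += 1

print(dict(profile))  # e.g. {'design': 1, 'maintenance management': 1}
```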
Relatively few of the organisational and managerial factors listed in table 3 are specific to safety; rather, they relate to the quality of the overall system. As such, they can also be used to gauge proactively the likelihood of negative outcomes other than coming into damaging contact with physical hazards, such as loss of market share, bankruptcy, and liability to criminal prosecution or civil law suits.
Table 3 Measures of organisational health used in different industrial settings
Oil exploration and production: hardware; design; maintenance management; procedures; error enforcing conditions; housekeeping; incompatible goals; organisation; communication; training; defences
Railways: tools and equipment; materials; supervision; working environment; staff attitudes; housekeeping; contractors; design; staff communication; departmental communication; staffing and rostering; training; planning; rules; management; maintenance
Aircraft maintenance: organisational structure; people management; provision and quality of tools and equipment; training and selection; commercial and operational pressures; planning and scheduling; maintenance of buildings and equipment; communication
The measurements derived from Tripod-Delta are summarised as bar graph profiles.
Their purpose is to identify the two or three
factors most in need of remediation and to
track changes over time. Maintaining adequate
safety health is thus comparable to a long term
fitness programme in which the focus of
remedial efforts switches from dimension to
dimension as previously salient factors improve
and new ones come into prominence. Like life,
effective safety management is "one thing after
another." Striving for the best attainable level
of intrinsic resistance to operational hazards is
like fighting a guerrilla war. One can expect no
absolute victories. There are no "Waterloos" in
the safety war.
Summary and conclusions
(1) Human rather than technical failures now
represent the greatest threat to complex
and potentially hazardous systems. This
includes healthcare systems.
(2) Managing the human risks will never be
100% effective. Human fallibility can be
moderated, but it cannot be eliminated.
(3) Different error types have different underlying mechanisms, occur in different parts
of the organisation, and require different
methods of risk management. The basic
distinctions are between:
* Slips, lapses, trips, and fumbles (execution failures) and mistakes (planning
or problem solving failures). Mistakes
are divided into rule based mistakes and
knowledge based mistakes
* Errors (information-handling problems)
and violations (motivational problems)
* Active versus latent failures. Active
failures are committed by those in direct
contact with the patient, latent failures
arise in organizational and managerial
spheres and their adverse effects may
take a long time to become evident.
(4) Safety significant errors occur at all levels
of the system, not just at the sharp end.
Decisions made in the upper echelons of
the organisation create the conditions in
the workplace that subsequently promote
individual errors and violations. Latent
failures are present long before an
accident and are hence prime candidates
for principled risk management.
(5) Measures that involve sanctions and
exhortations (that is, moralistic measures
directed to those at the sharp end) have
only very limited effectiveness, especially
so in the case of highly trained
professionals.
(6) Human factors problems are a product of
a chain of causes in which the individual
psychological factors (that is, momentary
inattention, forgetting, etc) are the last
and least manageable links. Attentional
"capture" (preoccupation or distraction)
is a necessary condition for the commission of slips and lapses. Yet its
occurrence is almost impossible to predict
or control effectively. The same is true of
the factors associated with forgetting.
States of mind contributing to error are
thus extremely difficult to manage; they
can happen to the best of people at any
time.
(7) People do not act in isolation. Their
behaviour is shaped by circumstances.
The same is true for errors and violations.
The likelihood of an unsafe act being
committed is heavily influenced by the
nature of the task and by the local workplace conditions. These, in turn, are
the product of "upstream" organizational
factors. Great gains in safety can be
achieved through relatively small modifications of equipment and workplaces.
(8) Automation and increasingly advanced
equipment do not cure human factors
problems, they merely relocate them.
In contrast, training people to work
effectively in teams costs little, but has
achieved significant enhancements of
human performance in aviation.
(9) Effective risk management depends
critically on a confidential and preferably
anonymous incident monitoring system
that records the individual, task, situational, and organizational factors associated with incidents and near misses.
(10) Effective risk management means the
simultaneous and targeted deployment of
limited remedial resources at different
levels of the system: the individual or
team, the task, the situation, and the
organisation as a whole.
1 Brennan TA, Leape LL, Laird NM, Hebert L, Localio AR, Lawthers AG, et al. Incidence of adverse events and negligence in hospitalized patients: results from the Harvard medical practice study I. New Engl J Med 1991;324:370-6.
2 Leape LL, Brennan TA, Laird NM, Lawthers AG, Localio AR, Barnes BA, et al. The nature of adverse events in hospitalized patients: results from the Harvard medical practice study II. New Engl J Med 1991;324:377-84.
3 Cook RI, Woods DD. Operating at the sharp end: the complexity of human error. In: Bogner MS, ed. Human error in medicine. Hillsdale, New Jersey: Erlbaum, 1994:255-310.
4 Gaba DM. Human error in anesthetic mishaps. Int Anesthesiol Clin 1989;27:137-47.
5 Gaba DM. Human error in dynamic medical domains. In: Bogner MS, ed. Human error in medicine. Hillsdale, New Jersey: Erlbaum, 1994:197-224.
6 Perrow C. Normal accidents. New York: Basic Books, 1984.
7 Vincent C, Ennis M, Audley RJ. Medical accidents. Oxford: Oxford University Press, 1993.
8 Bogner MS. Human error in medicine. Hillsdale, New Jersey: Erlbaum, 1994.
9 Runciman WB, Sellen A, Webb RK, Williamson JA, Currie M, Morgan C, et al. Errors, incidents and accidents in anaesthetic practice. Anaesth Intensive Care 1993;21:506-19.
10 Hollnagel E. Reliability of cognition: foundations of human reliability analysis. London: Academic Press, 1993.
11 Reason J. Human error. New York: Cambridge University Press, 1990.
12 Bacon F. In: Anderson F, ed. The new Organon. Indianapolis: Bobbs-Merrill, 1960. (Originally published 1620.)
13 Sheen. MV Herald of Free Enterprise. Report of court No 8074 formal investigation. London: Department of Transport, 1987.
14 Eagle CJ, Davies JM, Reason JT. Accident analysis of large
scale technological disasters applied to an anaesthetic
complication. Canadian Journal of Anaesthesia 1992;39:
118-22.
15 Reason J. The human factor in medical accidents. In:
Vincent C, Ennis M, Audley R, eds. Medical accidents.
Oxford: Oxford University Press, 1993:1-16.
16 Wiener EL. Human factors of advanced technology ("glass
cockpit") transport aircraft. Moffett Field, California:
NASA Ames Research Center, 1989. Technical report
117528.
17 Woods DD, Johannesen JJ, Cook RI, Sarter NB. Behind
human error: cognitive systems, computers, and hindsight.
Wright-Patterson Air Force Base, Ohio: Crew Systems
Ergonomics Information Analysis Center, 1994.
(CSERIAC state of the art report.)
18 NUREG. Loss of an iridium-192 source and therapy
misadministration at Indiana Regional Cancer Center,
Indiana, Pennsylvania, on November 16, 1992.
Washington, DC: US Nuclear Regulatory Commission,
1993. (NUREG-1480.)
19 Helmreich RL, Butler RA, Taggart WR, Wilhelm JA.
Behavioral markers in accidents and incidents: reference list.
Austin, Texas: University of Texas, 1994. (Technical
report 94-3; NASA/University of Texas FAA Aerospace
Crew Research Project.)
20 Helmreich RL, Schaefer H-G. Team performance in the
operating room. In: Bogner MS, ed. Human errors in
medicine. Hillsdale, New Jersey: Erlbaum, 1994.
21 Runciman WB. Anaesthesia incident monitoring study. In:
Incident monitoring and risk management in the health care
sector. Canberra: Commonwealth Department of Human
Services and Health, 1994:13-5.
22 Williams J. A data-based method for assessing and reducing
human error to improve operational performance. In:
Hagen W, ed. 1988 IEEE Fourth Conference on Human
Factors and Power Plants. New York: Institute of Electrical
and Electronic Engineers, 1988:200-31.
23 Reason J, Mycielska K. Absent-minded? The psychology of
mental lapses and everyday errors. Englewood Cliffs, New
Jersey: Prentice-Hall, 1982.
24 Reason J. A systems approach to organisation errors.
Ergonomics (in press).
25 Hudson P, Reason J, Wagenaar W, Bentley P, Primrose M,
Visser J. Tripod Delta: proactive approach to enhanced
safety. Journal of Petroleum Technology 1994;46:
58-62.