
Understanding adverse events: human factors

1995, Quality and Safety in Health Care

AI-generated Abstract

The paper discusses the increasing involvement of human factors specialists in understanding and preventing medical accidents, highlighting the parallels between medical mishaps and catastrophic failures in high-risk industries. It emphasizes that a significant percentage of patients experience unintended injuries due to treatment, underscoring the importance of examining human and organizational contributions to these events. By drawing on models and measures from other sectors, the paper advocates for effective strategies to improve the safety culture and overall reliability of healthcare provision.

Quality in Health Care 1995;4:80-89

James Reason, professor
Department of Psychology, University of Manchester, Manchester M13 9PL

A decade ago, very few specialists in human factors were involved in the study and prevention of medical accidents. Now there are many. Between the 1940s and 1980s a major concern of that community was to limit the human contribution to the conspicuously catastrophic breakdown of high hazard enterprises such as air, sea, and road transport; nuclear power generation; chemical process plants; and the like. Accidents in these systems cost many lives, create widespread environmental damage, and generate much public and political concern.

By contrast, medical mishaps mostly affect single individuals in a wide variety of healthcare institutions and are seldom discussed publicly. Only within the past few years has the likely extent of these accidental injuries become apparent. The Harvard medical practice study found that 4% of patients in hospital in New York in 1984 sustained unintended injuries caused by their treatment. For New York state this amounted to 98 600 injuries in one year and, when extrapolated to the entire United States, to the staggering figure of 1.3 million people harmed annually - more than twice the number injured in one year in road accidents in the United States.1 2

Since the mid-1980s several interdisciplinary research groups have begun to investigate the human and organisational factors affecting the reliability of healthcare provision. Initially, these collaborations were focused around the work of anaesthetists and intensivists,3 4 partly because these professionals' activities shared much in common with those of more widely studied groups such as pilots and operators of nuclear power plants. This commonality existed at two levels.
* At the "sharp end" (that is, at the immediate human-system or doctor-patient interface) common features include uncertain and dynamic environments, multiple sources of concurrent information, shifting and often ill defined goals, reliance on indirect or inferred indications, actions having immediate and multiple consequences, moments of intense time stress interspersed with long periods of routine activity, advanced technologies with many redundancies, complex and often confusing human-machine interfaces, and multiple players with differing priorities and high stakes.5
* At an organisational level these activities are carried on within complex, tightly coupled institutional settings and entail multiple interactions between different professional groups.6 This is extremely important not only for understanding the character and aetiology of medical mishaps but also for devising more effective remedial measures.

More recently, the interest in the human factors of health care has spread to a wide range of medical specialties (for example, general practice, accident and emergency care, obstetrics and gynaecology, radiology, psychiatry, surgery, etc). This burgeoning concern is reflected in several recent texts and journal articles devoted to medical accidents7-9 and in the creation of incident monitoring schemes that embody leading edge thinking with regard to human and organisational contributions.9 One of the most significant consequences of the collaboration between specialists in medicine and in human factors is the widespread acceptance that models of causation of accidents developed for domains such as aviation and nuclear power generation apply equally well to most healthcare applications. The same is also true for many of the diagnostic and remedial measures that have been created within these non-medical areas.

I will first consider the different ways in which humans can contribute to the breakdown of complex, well defended technologies.
Then I will show how these various contributions may be combined within a generic model of accident causation and illustrate its practical application with two case studies of medical accidents. Finally, I will outline the practical implications of such models for improving risk management within the healthcare domain.

Human contribution

A recent survey of published work on human factors disclosed that the estimated contribution of human error to accidents in hazardous technologies increased fourfold between the 1960s and '90s, from minima of around 20% to maxima of beyond 90%.10 One possible inference is that people have become more prone to error. A likelier explanation, however, is that equipment has become more reliable and that accident investigators have become increasingly aware that safety-critical errors are not restricted to the "sharp end." Figures of around 90% are hardly surprising considering that people design, build, operate, maintain, organise, and manage these systems. The large contribution of human error is more a matter of opportunity than the result of excessive carelessness, ignorance, or recklessness. Whatever the true figure, though, human behaviour - for good or ill - clearly dominates the risks to modern technological systems, medical or otherwise.

Not long ago, these human contributions would have been lumped together under the catch all label of "human error." Now it is apparent that unsafe acts come in many forms - slips, lapses and mistakes, errors and violations - each having different psychological origins and requiring different countermeasures. Nor can we take account of only those human failures that were the proximal causes of an accident.
Major accident inquiries (for example, those for the Three Mile Island nuclear reactor accident, Challenger (space shuttle) explosion, King's Cross underground fire, Herald of Free Enterprise capsizing, Piper Alpha explosion and fire, Clapham rail disaster, Exxon Valdez oil spill, Kegworth air crash, etc) make it apparent that the human causes of major accidents are distributed very widely, both within an organisation as a whole and over several years before the actual event. In consequence, we also need to distinguish between active failures (having immediate adverse outcomes) and latent or delayed action failures that can exist for long periods before combining with local triggering events to penetrate the system's defences.

Human errors may be classified either by their consequences or by their presumed causes. Consequential classifications are already widely used in medicine. The error is described in terms of the proximal actions contributing to a mishap (for example, administration of a wrong drug or a wrong dose, wrong intubation, nerve or blood vessel unintentionally severed during surgery, etc). Causal classifications, on the other hand, make assumptions about the psychological mechanisms implicated in generating the error. Since causal or psychological classifications are not widely used in medicine (though there are notable exceptions, see Gaba,4 Runciman et al9) a brief description of the main distinctions among types of errors and their underlying rationale is given below. Psychologists divide errors into two causally determined groups (see Reason11), as summarised in figure 1.

SLIPS AND LAPSES VERSUS MISTAKES: THE FIRST DISTINCTION
Error can be defined in many ways. For my present purpose an error is the failure of planned actions to achieve their desired goal. There are basically two ways in which this failure can occur, as follows.
* The plan is adequate, but the associated actions do not go as intended.
The failures are failures of execution and are commonly termed slips and lapses. Slips relate to observable actions and are associated with attentional failures. Lapses are more internal events and relate to failures of memory.
* The actions may go entirely as planned, but the plan is inadequate to achieve its intended outcome. These are failures of intention, termed mistakes. Mistakes can be further subdivided into rule based mistakes and knowledge based mistakes (see below).

All errors involve some kind of deviation. In the case of slips, lapses, trips and fumbles, actions deviate from the current intention. Here the failure occurs at the level of execution. For mistakes, the actions may go entirely as planned but the plan itself deviates from some adequate path towards its intended goal. Here the failure lies at a higher level: with the mental processes involved in planning, formulating intentions, judging, and problem solving.

Slips and lapses occur during the largely automatic performance of some routine task, usually in familiar surroundings. They are almost invariably associated with some form of attentional capture, either distraction from the immediate surroundings or preoccupation with something in mind. They are also provoked by change, either in the current plan of action or in the immediate surroundings. Figure 2 shows the further subdivisions of slips and lapses; these have been discussed in detail elsewhere.11

Mistakes can begin to occur once a problem has been detected. A problem is anything that requires a change or alteration of the current plan. Mistakes may be subdivided into two groups, as follows.
* Rule based mistakes, which relate to problems for which the person possesses some prepackaged solution, acquired as the result of training, experience, or the availability of appropriate procedures.
The associated errors may come in various forms: the misapplication of a good rule (usually because of a failure to spot the contraindications), the application of a bad rule, or the nonapplication of a good rule.
* Knowledge based mistakes, which occur in novel situations where the solution to a problem has to be worked out on the spot without the help of preprogrammed solutions. This entails the use of slow, resource-limited but computationally-powerful conscious reasoning carried out in relation to what is often an inaccurate and incomplete "mental model" of the problem and its possible causes. Under these circumstances the human mind is subject to several powerful biases, of which the most universal is confirmation bias. This was described by Sir Francis Bacon more than 300 years ago. "The human mind when it has once adopted an opinion draws all things else to support and agree with it."12 Confirmation bias or "mindset" is particularly evident when trying to diagnose what has gone wrong with a malfunctioning system. We "pattern match" a possible cause to the available signs and symptoms and then seek out only that evidence that supports this particular hunch, ignoring or rationalising away contradictory facts. Other biases have been discussed elsewhere.11

Fig 1 Distinguishing slips, lapses, and mistakes (labels in the figure: errors; slips, lapses, trips, and fumbles: execution failures; mistakes: planning or problem solving failures)
Fig 2 Varieties of slips and lapses (labels in the figure: slips and lapses; recognition failures; attentional failures; memory failures; selection failures)

ERRORS VERSUS VIOLATIONS: THE SECOND DISTINCTION
Violations are deviations from safe operating practices, procedures, standards, or rules. Here, we are mostly interested in deliberate violations, in which the actions (though not the possible bad consequences) were intended. Violations fall into three main groups.
* Routine violations, which entail cutting corners whenever such opportunities present themselves
* Optimising violations, or actions taken to further personal rather than strictly task related goals (that is, violations for "kicks" or to alleviate boredom)
* Necessary or situational violations that seem to offer the only path available to getting the job done, and where the rules or procedures are seen to be inappropriate for the present situation.

Deliberate violations differ from errors in several important ways.
* Whereas errors arise primarily from informational problems (that is, forgetting, inattention, incomplete knowledge, etc) violations are more generally associated with motivational problems (that is, low morale, poor supervisory example, perceived lack of concern, the failure to reward compliance and sanction non-compliance, etc)
* Errors can be explained by what goes on in the mind of an individual, but violations occur in a regulated social context
* Errors can be reduced by improving the quality and the delivery of necessary information within the workplace. Violations require motivational and organisational remedies.

ACTIVE VERSUS LATENT HUMAN FAILURES: THE THIRD DISTINCTION
In considering how people contribute to accidents a third and very important distinction is necessary - namely, that between active and latent failures. The difference concerns the length of time that passes before human failures are shown to have an adverse impact on safety. For active failures the negative outcome is almost immediate, but for latent failures the consequences of human actions or decisions can take a long time to be disclosed, sometimes many years. The distinction between active and latent failures owes much to Mr Justice Sheen's observations on the capsizing of the Herald of Free Enterprise. In his inquiry report, he wrote:

At first sight the faults which led to this disaster were the ... errors of omission on the part of the Master, the Chief Officer and the assistant bosun ... But a full investigation into the circumstances of the disaster leads inexorably to the conclusion that the underlying or cardinal faults lay higher up in the Company ... From top to bottom the body corporate was infected with the disease of sloppiness.13

Here the distinction between active and latent failures is made very clear. The active failures - the immediate causes of the capsize - were various errors on the part of the ships' officers and crew. But, as the inquiry disclosed, the ship was a "sick" ship even before it sailed from Zeebrugge on 6 March 1987.

To summarise the differences between active and latent failures:
* Active failures are unsafe acts (errors and violations) committed by those at the "sharp end" of the system (surgeons, anaesthetists, nurses, physicians, etc). It is the people at the human-system interface whose actions can, and sometimes do, have immediate adverse consequences
* Latent failures are created as the result of decisions taken at the higher echelons of an organisation. Their damaging consequences may lie dormant for a long time, only becoming evident when they combine with local triggering factors (for example, the spring tide, the loading difficulties at Zeebrugge harbour, etc) to breach the system's defences.

Thus, the distinction between active and latent failures rests on two considerations: firstly, the length of time before the failures have a bad outcome and, secondly, where in an organisation the failures occur. Generally, medical active failures are committed by those people in direct contact with the patient, and latent failures occur within the higher echelons of the institution, in the organisational and management spheres. A brief account of a model showing how top level decisions create conditions that produce accidents in the workplace is given below.

Aetiology of "organisational" accidents

The technological advances of the past 20 years, particularly in regard to engineered safety features, have made many hazardous systems largely proof against single failures, either human or mechanical. Breaching the "defences in depth" now requires the unlikely confluence of several causal streams. Unfortunately, the increased automation afforded by cheap computing power also provides greater opportunities for the insidious accumulation of latent failures within the system as a whole. Medical systems and items of equipment have become more opaque to the people who work them and are thus especially prone to the rare, but often catastrophic, "organisational accident." Tackling these organisational failures represents a major challenge in medicine and elsewhere.

Figure 3 shows the anatomy of an organisational accident, the direction of causality being from left to right. The accident sequence begins with the negative consequences of organisational processes (that is, decisions concerned with planning, scheduling, forecasting, designing, policy making, communicating, regulating, maintaining, etc). The latent failures so created are transmitted along various organisational and departmental pathways to the workplace (the operating theatre, the ward, etc), where they create the local conditions that promote the commission of errors and violations (for example, understaffing, high workload, poor human equipment interfaces, etc). Many of these unsafe acts are likely to be committed, but only very few of them will penetrate the defences to produce damaging outcomes. The fact that engineered safety features, standards, controls, procedures, etc, can be deficient due to latent failures as well as active failures is shown in the figure by the arrow connecting organisational processes directly to defences.

Fig 3 Stages of development of organisational accident (labels in the figure: corporate culture; local climate; situation; task; defences; barriers; management decisions and organisational processes; error-producing conditions; violation-producing conditions; errors; violations; latent failures in defences)

The model presents the people at the sharp end as the inheritors rather than as the instigators of an accident sequence. This may seem as if the "blame" for accidents has been shifted from the sharp end to the system managers. But this is not the case for the following reasons.
* The attribution of blame, though often emotionally satisfying, hardly ever translates into effective countermeasures. Blame implies delinquency, and delinquency is normally dealt with by exhortations and sanctions. But these are wholly inappropriate if the individual people concerned did not choose to err in the first place and were not appreciably prone to error.
* High level management and organisational decisions are shaped by economic, political, and operational constraints. Like designs, decisions are nearly always a compromise. It is thus axiomatic that all strategic decisions will carry some negative safety consequences for some part of the system. This is not to say that all such decisions are flawed, though some of them will be. But even those decisions judged at the time as being good ones will carry a potential downside. Resources, for example, are rarely allocated evenly. There are nearly always losers. In judging uncertain futures some of the shots will inevitably be called wrong. The crux of the matter is that we cannot prevent the creation of latent failures; we can only make their adverse consequences visible before they combine with local triggers to breach the system's defences.

These organisational root causes are further complicated by the fact that the healthcare system as a whole involves many interdependent organisations: manufacturers, government agencies, professional and patient organisations, etc. The model shown in figure 3 relates primarily to a given institution, but the reality is considerably more complex, with the behaviour of other organisations impinging on the accident sequence at many different points.

Applying the organisational accident model in medicine: two case studies

Two radiological case studies are presented to give substance to this rather abstract theoretical framework and to emphasise some important points regarding the practice of high tech medicine. Radiological mishaps tend to be extensively investigated, particularly in the United States, where these examples occurred. But organisational accidents should not be assumed to be unique to this specialty. An entirely comparable anaesthetic case study has been presented elsewhere.14 15 Generally, though, medical accidents have rarely been investigated to the extent that their systemic and institutional root causes are disclosed, so the range of suitable case studies is limited. The box describes details of the first case study.

Case 1: Therac-25 accident at East Texas Medical Centre (1986)
A 33 year old man was due to receive his ninth radiation treatment after surgery for the removal of a tumour on his left shoulder. The radiotherapy technician positioned him on the table and then went to her adjoining control room. The Therac-25 machine had two modes: a high power "x ray" mode and a low power "electron beam" mode. The high power mode was selected by typing an "x" on the keyboard of the VT100 terminal. This put the machine on maximum power and inserted a thick metal plate between the beam generator and the patient. The plate transformed the 25 million volt electron beam into therapeutic x rays. The low power mode was selected by typing "e" and was designed to deliver a 200 rad beam to the tumour.
The intention on this occasion was to deliver the low power beam. But the technician made a slip and typed in an "x" instead of an "e." She immediately detected her error, pressed the "up" arrow to select the edit functions from the screen menu and changed the incorrect "x" command to the desired "e" command. The screen now confirmed that the machine was in electron beam mode. She returned the cursor to the bottom of the screen in preparation for the "beam ready" display showing that the machine was fully charged. As soon as the "beam ready" signal appeared she depressed the "b" key to activate the beam.
What she did not realise - and had no way of knowing - was that an undetected bug in the software had retracted the thick metal protective plate (used in the x ray mode) but had left the power setting on maximum. As soon as she activated the "b" command, a blast of 25 000 rads was delivered to the patient's unprotected shoulder. He saw a flash of blue light (Cherenkov radiation), heard his flesh frying, and felt an excruciating pain. He called out to the technician, but both the voice and video intercom were switched off.
Meanwhile, back in the control room, the computer screen displayed a "malfunction 54" error signal. This meant little to the technician. She took it to mean that the beam had not fired, so reset the machine to fire again. Once again, she received the "malfunction 54" signal, and once more she reset and fired the machine. As a result, the patient received three 25 000 rad blasts to his neck and upper torso, although the technician's display showed that he had only received a tenth of his prescribed treatment dose. The patient died four months later with gaping lesions on his upper body. His wry comment was: "Captain Kirk forgot to put his phaser on stun."
A very similar incident occurred three weeks later. Subsequently, comparable overdoses were discovered to have been administered in three other centres using the same equipment.

Several latent failures contributed to this accident.
* The Canadian manufacturer had not considered it possible that a technician could enter that particular sequence of keyboard commands within the space of eight seconds and so had not tested the effects of these closely spaced inputs
* The technician had not been trained to interpret the error signals
* It was regarded as normal practice to carry out radiation treatment without video or sound communication with the patient
* Perhaps most significantly, the technician was provided with totally inadequate feedback regarding the state of the machine and its prior activity.

This case study provides a clear example of what has been called "clumsy automation."3 16 17 Automation intended to reduce errors created by the variability of human performance increases the probability of certain kinds of mistakes by making the system and its current state opaque to the people who operate it. Comparable problems have been identified in the control rooms of nuclear power plants, on the flight decks of modern airliners, and in relation to contemporary anaesthetic work stations.17 Automation and "defence in depth" mean that these complex systems are largely protected against single failures. But they render the workings of the system more mysterious to its human controllers. In addition, they permit the subtle build up of latent failures, hidden behind high technology interfaces and within the interdepartmental interstices of complex organisations.

The second case study has all the causal hallmarks of an organisational accident but differs from most medical mishaps in having adverse outcomes for nearly 100 people. The accident is described in detail elsewhere.18

Case 2: Omnitron 2000 accident at Indiana Regional Cancer Centre (1992)
An elderly patient with anal carcinoma was treated with high dose rate (HDR) brachytherapy. Five catheters were placed in the tumour. An iridium-192 source (4.3 curie, 1.6 E + 11 becquerel) was intended to be located in various positions within each catheter, using a remotely controlled Omnitron 2000 afterloader. The treatment was the first of three treatments planned by the doctor, and the catheters were to remain in the patient for the subsequent treatments.
The iridium source wire was placed in four of the catheters without apparent difficulty, but after several unsuccessful attempts to insert the source wire into the fifth catheter, the treatment was terminated. In fact, a wire had broken, leaving an iridium source inside one of the first four catheters.
Four days later the catheter containing the source came loose and eventually fell out of the patient. It was picked up and placed in a storage room by a member of staff of the nursing home, who did not realise it was radioactive. Five days later a truck picked up the waste bag containing the source. As part of the driver's normal routine the bag was then driven to the depot and remained there for a day (during Thanksgiving) before being delivered to a medical waste incinerator, where the source was detected by fixed radiation monitors at the site. It was left over the weekend but was then traced to the nursing home. It was retrieved nearly three weeks after the original treatment. The patient had died five days after the treatment session, and in the ensuing weeks over 90 people had been irradiated in varying degrees by the iridium source.

The accident occurred as the result of a combination of procedural violations (resulting in breached or ignored defences) and latent failures.
Active failures
* The area radiation monitor alarmed several times during the treatment but was ignored, partly because the doctor and technicians knew that it had a history of false alarms
* The console indicator showed "safe" and the attending staff mistakenly believed the source to be fully retracted into the lead shield
* The truck driver deviated from company procedures when he failed to check the nursing home waste with his personal radiation survey meter.

Latent failures
* The rapid expansion of high dose rate brachytherapy, from one to ten facilities in less than a year, had created serious weaknesses in the radiation safety programme
* Too much reliance was placed on unwritten or informal procedures and working practices
* There were serious inadequacies in the design and testing of the equipment
* There was a poor organisational safety culture. The technicians routinely ignored alarms and did not survey patients, the afterloader, or the treatment room after high dose rate procedures.
* There was weak regulatory oversight. The Nuclear Regulatory Commission did not adequately address the problems and dangers associated with high dose rate procedures.

This case study illustrates how a combination of active failures and latent systemic weaknesses can conspire to penetrate the many layers of defences which are designed to protect both patients and staff. No one person was to blame; each person acted according to his or her appraisal of the situation, yet one person died and over 90 people were irradiated.

Principled risk management

In many organisations managing the human risks has concentrated on trying to prevent the recurrence of specific errors and violations that have been implicated in particular local mishaps.
The common internal response to such events is to issue new procedures that proscribe the particular behaviour; to devise engineering "retro-fixes" that will prevent such actions having adverse outcomes; to sanction, exhort, and retrain key staff in an effort to make them more careful; and to introduce increased automation. This "anti-personnel" approach has several problems.
(1) People do not intend to commit errors. It is therefore difficult for others to control what people cannot control for themselves.
(2) The psychological precursors of an error (that is, inattention, distraction, preoccupation, forgetting, fatigue, and stress) are probably the last and least manageable links in the chain of events leading to an error.
(3) Accidents rarely occur as the result of single unsafe acts. They are the product of many factors: personal, task related, situational, and organisational. This has two implications. Firstly, the mere recurrence of some act involved in a previous accident will probably not have an adverse outcome in the absence of the other causal factors. Secondly, so long as these underlying latent problems persist, other acts not hitherto regarded as unsafe can also serve to complete an incipient accident sequence.
(4) These countermeasures can create a false sense of security.3 Since modern systems are usually highly reliable some time is likely to pass between implementing these personnel related measures and the next mishap. During this time, those who have instituted the changes are inclined to believe that they have fixed the problem. But then a different kind of mishap occurs, and the cycle of local repairs begins all over again. Such accidents tend to be viewed in isolation, rather than being seen as symptomatic of some underlying systemic malaise.
(5) Increased automation does not cure the human factors problem; it simply changes its nature. Systems become more opaque to their operators.
Instead of causing harm by slips, lapses, trips and fumbles, people are now more prone to make mistaken judgements about the state of the system.

The goal of effective risk management is not so much to minimise particular errors and violations as to enhance human performance at all levels of the system.3 Perhaps paradoxically, most performance enhancement measures are not directly focused at what goes on inside the heads of single individuals. Rather, they are directed at team, task, situation, and organisational factors, as discussed below.

TEAM FACTORS
A great deal of health care is delivered by multidisciplinary teams. Over a decade of experience in aviation (and, more recently, marine technology) has shown that measures designed to improve team management and the quality of the communications between team members can have an enormous impact on human performance. Helmreich (one of the pioneers of crew resource management) and his colleagues at the University of Texas analysed 51 aircraft accidents and incidents, paying special attention to team related factors.19 The box summarises their findings, where the team related factors are categorised as negative (having an adverse impact upon safety and survivability) or positive (acting to improve survivability). The numbers given in each case relate to the number of accidents or incidents in which particular team related factors had a negative or a positive role. This list offers clear recommendations for the interactions of medical teams just as much as for aircraft crews.
Recently, the aviation psychologist Robert Helmreich and the anaesthetist Hans-Gerhard Schaefer studied team performance in the operating theatre of a Swiss teaching hospital.20 They noted that "interpersonal and communications issues are responsible for many inefficiencies, errors, and frustrations in this psychologically and organisationally complex environment."8 They also observed that attempts to improve institutional performance largely entailed throwing money at the problem through the acquisition of new and ever more advanced equipment, whereas improvements to training and team performance could be achieved more effectively at a fraction of this cost. As has been clearly shown for aviation, formal training in team management and communication skills can produce substantial improvements in human performance as well as reducing safety-critical errors.

Box Team related factors and role in 51 aircraft accidents and incidents*
* Team concept and environment for open communications established (negative 7; positive 2)
* Briefings are operationally thorough, interesting, and address crew coordination and planning for potential problems. Expectations are set for how possible deviations from normal operations are to be handled (negative 9; positive 2)
* Cabin crew are included as part of the team in briefings, as appropriate, and guidelines are established for coordination between flight deck and cabin (negative 2)
* Group climate is appropriate to operational situation (for example, presence of social conversation). Crew ensures that nonoperational factors such as social interaction do not interfere with necessary tasks (negative 13; positive 4)
* Crew members ask questions regarding crew actions and decisions (negative 11; positive 4)
* Crew members speak up and state their information with appropriate persistence until there is some clear resolution or decision (negative 14; positive 4)
* Captain coordinates flight deck activities to establish proper balance between command authority and crew member participation and acts decisively when the situation requires it (negative 18; positive 4)
* Workload and task distribution are clearly communicated and acknowledged by crew members. Adequate time is provided for the completion of tasks (negative 12; positive 4)
* Secondary tasks are prioritised to allow sufficient resources for dealing effectively with primary duties (negative 5; positive 2)
* Crew members check with each other during times of high and low workload to maintain situational awareness and alertness (negative 3; positive 3)
* Crew prepares for expected contingency situations (negative 28; positive 4)
* Guidelines are established for the operation and disablement of automated systems. Duties and responsibilities with regard to automated systems are made clear. Crew periodically review and verify the status of automated systems. Crew verbalises and acknowledges entries and changes to automated systems. Crew allows sufficient time for programming automated systems before manoeuvres (negative 14)
* When conflicts arise the crew remains focused on the problem or situation at hand. Crew members listen actively to ideas and opinions and admit mistakes when wrong (negative 2)
*After Helmreich et al19

TASK FACTORS
Tasks vary widely in their liability to promote errors. Identifying and modifying tasks and task elements that are conspicuously prone to failure are essential steps in risk management.

The following simple example is representative of many maintenance tasks. Imagine a bolt with eight nuts on it. Each nut is coded and has to be located in a particular sequence. Disassembly is virtually error free. There is only one way in which the nuts can be removed from the bolt and all the necessary knowledge to perform this task is located in the world (that is, each step in the procedure is automatically cued by the preceding one). But the task of correct reassembly is immensely more difficult. There are over 40 000 ways in which this assemblage of nuts can be wrongly located on the bolt (factorial 8). In addition, the knowledge necessary to get the nuts back in the right order has to be either memorised or read from some written procedure, both of which are highly liable to error or neglect. Such an example may seem at first sight to be far removed from the practice of medicine, but medical equipment, like any other sophisticated hardware, requires careful maintenance and maintenance errors (particularly omitting necessary reassembly steps) constitute one of the greatest sources of human factors problems in high technology industries.11

Effective incident monitoring is an invaluable tool in identifying tasks prone to error.
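
As a quick check on the arithmetic in the bolt example above, the following minimal sketch counts the possible sequences for eight distinct nuts. It assumes, as the text implies, that exactly one of the 8! sequences is correct. The function names `wrong_orderings` and `wrong_orderings_brute` are illustrative only and do not come from the source.

```python
import math
from itertools import permutations


def wrong_orderings(n_nuts: int) -> int:
    """Count incorrect sequences for n distinct nuts, assuming exactly one is correct."""
    return math.factorial(n_nuts) - 1


def wrong_orderings_brute(n_nuts: int) -> int:
    """Cross-check by enumeration (only practical for small n)."""
    correct = tuple(range(n_nuts))  # assumed: the single correct sequence
    return sum(1 for p in permutations(range(n_nuts)) if p != correct)


total = math.factorial(8)   # all possible sequences of eight nuts
wrong = wrong_orderings(8)  # every sequence except the correct one
print(total, wrong)         # prints: 40320 40319
```

Disassembly, by contrast, admits a single sequence, which is consistent with the text's description of it as virtually error free.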
On the basis of their body of nearly 4000 anaesthetic and intensive care incidents, Runciman et al at the Royal Adelaide Hospital (see Runciman et al9 for a report of the first 2000 incidents) introduced many inexpensive equipment modifications guaranteed to enhance performance and to minimise recurrent errors. These include colour coded syringes and endotracheal tubes graduated to help non-intrusive identification of endobronchial intubation.21

SITUATIONAL FACTORS
Each type of task has its own nominal error probability. For example, carrying out a totally novel task with no clear idea of the likely consequences (that is, knowledge based processing) has a basic error probability of 0.75. At the other extreme, a highly familiar, routine task performed by a well motivated and competent workforce has an error probability of 0.0005. But there are certain conditions both of the individual person and his or her immediate environment that are guaranteed to increase these nominal error probabilities (table 1). Here the error producing conditions are ranked in the order of their known effects and the numbers in parentheses indicate the risk factor (that is, the amount by which the nominal error rates should be multiplied under the worst conditions).

Table 1 Summary of error producing conditions ranked in order of known effect (after Williams22)
Condition                                   Risk factor
Unfamiliarity with the task                 (× 17)
Time shortage                               (× 11)
Poor signal:noise ratio                     (× 10)
Poor human system interface                 (× 8)
Designer user mismatch                      (× 8)
Irreversibility of errors                   (× 8)
Information overload                        (× 6)
Negative transfer between tasks             (× 5)
Misperception of risk                       (× 4)
Poor feedback from system                   (× 4)
Inexperience - not lack of training         (× 3)
Poor instructions or procedures             (× 3)
Inadequate checking                         (× 3)
Educational mismatch of person with task    (× 2)
Disturbed sleep patterns                    (× 1.6)
Hostile environment                         (× 1.2)
Monotony and boredom                        (× 1)

Notably, three of the best researched factors - namely, sleep disturbance, hostile environment, and boredom - carry the least penalties. Also, those error producing factors at the top of the list are those that lie squarely within the organisational sphere of influence. This is a central element in the present view of organisational accidents. Managers and administrators rarely, if ever, have the opportunity to jeopardise a system's safety directly. Their influence is more indirect: top level decisions create the conditions that promote unsafe acts.

For convenience, error producing conditions can be reduced to seven broad categories: high workload; inadequate knowledge, ability, or experience; poor interface design; inadequate supervision or instruction; stressful environment; mental state (fatigue, boredom, etc); and change. Departures from routine and changes in the circumstances in which actions are normally performed constitute a major factor in absentminded slips of action.23

Compared to error producing conditions, the factors that promote violations are less well understood. Ranking their relative effects is not possible. However, we can make an informed guess at the nature of these violation producing conditions, as shown in table 2, although in no particular order of effect. Again, for causal analysis this list can be reduced to a few general categories: lack of safety culture, lack of concern, poor morale, norms condoning violation, "can do" attitudes, and apparently meaningless or ambiguous rules.

Table 2 Violation producing conditions, unranked
Conditions
Manifest lack of organisational safety culture
Conflict between management and staff
Poor morale
Poor supervision and checking
Group norms condoning violations
Misperception of hazards
Perceived lack of management care and concern
Little elan or pride in work
Culture that encourages taking risks
Beliefs that bad outcomes will not happen
Low self esteem
Learned helplessness
Perceived licence to bend rules
Ambiguous or apparently meaningless rules
Rules inapplicable due to local conditions
Inadequate tools and equipment
Inadequate training
Time pressure
Professional attitudes hostile to procedures

ORGANISATIONAL FACTORS
Quality and safety, like health and happiness, have two aspects: a negative aspect disclosed by incidents and accidents and a positive aspect, to do with the system's intrinsic resistance to human factors problems. Whereas incidents and accidents convert easily into numbers, trends, and targets, the positive aspect is much harder to identify and measure.

Accident and incident reporting procedures are a crucial part of any safety or quality information system. But, by themselves, they are insufficient to support effective quality and safety management. The information they provide is both too little and too late for this longer term purpose. To promote proactive accident prevention rather than reactive "local repairs" an organisation's "vital signs" should be monitored regularly.

When a doctor carries out a routine medical check he or she samples the state of several critical bodily systems: the cardiovascular, pulmonary, excretory, neurological systems, and so on. From individual measures of blood pressure, electrocardiographic activity, cholesterol concentration, urinary contents, reflexes, and so on the doctor makes a professional judgement about the individual's general state of health. There is no direct, definitive measure of a person's health. It is an emergent property inferred from a selection of physiological signs and lifestyle indicators.
The same is also true of complex hazardous systems. Assessing an organisation's current state of "safety health," as in medicine, entails regular and judicious sampling of a small subset of a potentially large number of indices. But what are the dimensions along which to assess organisational "safety health"?

Several such diagnostic techniques are already being implemented in various industries.24 The individual labels for the assessed dimensions vary from industry to industry (oil exploration and production, tankers, helicopters, railway operations, and aircraft engineering), but all of them have been guided by two principles. Firstly, they try to include those organisational "pathogens" that have featured most conspicuously in well documented accidents (that is, hardware defects, incompatible goals, poor operating procedures, understaffing, high workload, inadequate training, etc). Secondly, they seek to encompass a representative sampling of those core processes common to all technological organisations (that is, design, build, operate, maintain, manage, communicate, etc).

Since there is unlikely to be a single universal set of indicators for all types of hazardous operations, one way of communicating how safety health can be assessed is simply to list the organisational factors that are currently measured (see table 3). Tripod-Delta, commissioned by Shell International and currently implemented in several of its exploration and production operating companies, on Shell tankers, and on its contracted helicopters in the North Sea, assesses the quarterly or half yearly state of 11 general failure types in specific workplaces: hardware, design, maintenance management, procedures, error enforcing conditions, housekeeping, incompatible goals, organisational structure, communication, training, and defences.
A discussion of the rationale behind the selection and measurement of these failure types can be found elsewhere.25

Tripod-Delta uses tangible, dimension related indicators as direct measures or "symptoms" of the state of each of the 11 failure types. These indicators are generated by task specialists and are assembled into checklists by a computer program (Delta) for each testing occasion. The nature of the indicators varies from activity to activity (that is, drilling, seismic surveys, transport, etc) and from test to test. Examples of such indicators for design associated with an offshore platform are listed below. All questions have yes/no answers.
* Was this platform originally designed to be unmanned?
* Are shut-off valves fitted at a height of more than 2 metres?
* Is standard (company) coding used for the pipes?
* Are there locations on this platform where the deck and walkways differ in height?
* Have there been more than two unscheduled maintenance jobs over the past week?
* Are there any bad smells from the low pressure vent system?

Relatively few of the organisational and managerial factors listed in table 3 are specific to safety; rather, they relate to the quality of the overall system.
As such, they can also be used to gauge proactively the likelihood of negative outcomes other than coming into damaging contact with physical hazards, such as loss of market share, bankruptcy, and liability to criminal prosecution or civil law suits.

Table 3 Measures of organisational health used in different industrial settings
Oil exploration and production: Hardware; Design; Maintenance management; Procedures; Error enforcing conditions; Housekeeping; Incompatible goals; Organisation; Communication; Training; Defences
Railways: Tools and equipment; Materials; Supervision; Working environment; Staff attitudes; Housekeeping; Contractors; Design; Staff communication; Departmental communication; Staffing and rostering; Training; Planning; Rules; Management; Maintenance
Aircraft maintenance: Organisational structure; People management; Provision and quality of tools and equipment; Training and selection; Commercial and operational pressures; Planning and scheduling; Maintenance of buildings and equipment; Communication

The measurements derived from Tripod-Delta are summarised as bar graph profiles. Their purpose is to identify the two or three factors most in need of remediation and to track changes over time. Maintaining adequate safety health is thus comparable to a long term fitness programme in which the focus of remedial efforts switches from dimension to dimension as previously salient factors improve and new ones come into prominence. Like life, effective safety management is "one thing after another." Striving for the best attainable level of intrinsic resistance to operational hazards is like fighting a guerrilla war. One can expect no absolute victories. There are no "Waterloos" in the safety war.

Summary and conclusions
(1) Human rather than technical failures now represent the greatest threat to complex and potentially hazardous systems. This includes healthcare systems.
(2) Managing the human risks will never be 100% effective.
Human fallibility can be moderated, but it cannot be eliminated.
(3) Different error types have different underlying mechanisms, occur in different parts of the organisation, and require different methods of risk management. The basic distinctions are between:
* Slips, lapses, trips, and fumbles (execution failures) and mistakes (planning or problem solving failures). Mistakes are divided into rule based mistakes and knowledge based mistakes
* Errors (information-handling problems) and violations (motivational problems)
* Active versus latent failures. Active failures are committed by those in direct contact with the patient; latent failures arise in organisational and managerial spheres and their adverse effects may take a long time to become evident.
(4) Safety significant errors occur at all levels of the system, not just at the sharp end. Decisions made in the upper echelons of the organisation create the conditions in the workplace that subsequently promote individual errors and violations. Latent failures are present long before an accident and are hence prime candidates for principled risk management.
(5) Measures that involve sanctions and exhortations (that is, moralistic measures directed to those at the sharp end) have only very limited effectiveness, especially so in the case of highly trained professionals.
(6) Human factors problems are a product of a chain of causes in which the individual psychological factors (that is, momentary inattention, forgetting, etc) are the last and least manageable links. Attentional "capture" (preoccupation or distraction) is a necessary condition for the commission of slips and lapses. Yet its occurrence is almost impossible to predict or control effectively. The same is true of the factors associated with forgetting. States of mind contributing to error are thus extremely difficult to manage; they can happen to the best of people at any time.
(7) People do not act in isolation.
Their behaviour is shaped by circumstances. The same is true for errors and violations. The likelihood of an unsafe act being committed is heavily influenced by the nature of the task and by the local workplace conditions. These, in turn, are the product of "upstream" organisational factors. Great gains in safety can be achieved through relatively small modifications of equipment and workplaces.
(8) Automation and increasingly advanced equipment do not cure human factors problems; they merely relocate them. In contrast, training people to work effectively in teams costs little, but has achieved significant enhancements of human performance in aviation.
(9) Effective risk management depends critically on a confidential and preferably anonymous incident monitoring system that records the individual, task, situational, and organisational factors associated with incidents and near misses.
(10) Effective risk management means the simultaneous and targeted deployment of limited remedial resources at different levels of the system: the individual or team, the task, the situation, and the organisation as a whole.

1 Brennan TA, Leape LL, Laird NM, Herbert L, Localio AR, Lawthers AG, et al. Incidence of adverse events and negligence in hospitalized patients: results from the Harvard medical practice study I. New Engl J Med 1991;324:370-6.
2 Leape LL, Brennan TA, Laird NM, Lawthers AG, Localio AR, Barnes BA, et al. The nature of adverse events in hospitalized patients: results from the Harvard medical practice study II. New Engl J Med 1991;324:377-84.
3 Cook RI, Woods DD. Operating at the sharp end: the complexity of human error. In: Bogner MS, ed. Human errors in medicine. Hillsdale, New Jersey: Erlbaum, 1994:255-310.
4 Gaba DM. Human error in anesthetic mishaps. Int Anesthesiol Clin 1989;27:137-47.
5 Gaba DM. Human error in dynamic medical domains. In: Bogner MS, ed. Human errors in medicine. Hillsdale, New Jersey: Erlbaum, 1994:197-224.
6 Perrow C. Normal accidents. New York: Basic Books, 1984.
7 Vincent C, Ennis M, Audley RJ. Medical accidents. Oxford: Oxford University Press, 1993.
8 Bogner MS. Human error in medicine. Hillsdale, New Jersey: Erlbaum, 1994.
9 Runciman WB, Sellen A, Webb RK, Williamson JA, Currie M, Morgan C, et al. Errors, incidents and accidents in anaesthetic practice. Anaesth Intensive Care 1993;21:506-19.
10 Hollnagel E. Reliability of cognition: foundations of human reliability analysis. London: Academic Press, 1993.
11 Reason J. Human error. New York: Cambridge University Press, 1990.
12 Bacon F. In: Anderson F, ed. The new Organon. Indianapolis: Bobbs-Merrill, 1960. (Originally published 1620.)
13 Sheen. MV Herald of Free Enterprise. Report of court No 8074 formal investigation. London: Department of Transport, 1987.
14 Eagle CJ, Davies JM, Reason JT. Accident analysis of large scale technological disasters applied to an anaesthetic complication. Canadian Journal of Anaesthesia 1992;39:118-22.
15 Reason J. The human factor in medical accidents. In: Vincent C, Ennis M, Audley R, eds. Medical accidents. Oxford: Oxford University Press, 1993:1-16.
16 Wiener EL. Human factors of advanced technology ("glass cockpit") transport aircraft. Moffett Field, California: NASA Ames Research Center, 1989. Technical report 117528.
17 Woods DD, Johannesen JJ, Cook RI, Sarter NB. Behind human error: cognitive systems, computers, and hindsight. Wright-Patterson Air Force Base, Ohio: Crew Systems Ergonomics Information Analysis Center, 1994. (CSERIAC state of the art report.)
18 NUREG. Loss of an iridium-192 source and therapy misadministration at Indiana Regional Cancer Center, Indiana, Pennsylvania, on November 16, 1992. Washington, DC: US Nuclear Regulatory Commission, 1993. (NUREG-1480.)
19 Helmreich RL, Butler RA, Taggart WR, Wilhem JA. Behavioral markers in accidents and incidents: reference list. Austin, Texas: University of Texas, 1994. (Technical report 94-3; NASA/University of Texas FAA Aerospace Crew Research Project.)
20 Helmreich RL, Schaefer H-G. Team performance in the operating room. In: Bogner MS, ed. Human errors in medicine. Hillsdale, New Jersey: Erlbaum, 1994.
21 Runciman WB. Anaesthesia incident monitoring study. In: Incident monitoring and risk management in the health care sector. Canberra: Commonwealth Department of Human Services and Health, 1994:13-5.
22 Williams J. A data-based method for assessing and reducing human error to improve operational performance. In: Hagen W, ed. 1988 IEEE Fourth Conference on Human Factors and Power Plants. New York: Institute of Electrical and Electronic Engineers, 1988:200-31.
23 Reason J, Mycielska K. Absent-minded? The psychology of mental lapses and everyday errors. Englewood Cliffs, New Jersey: Prentice-Hall, 1982.
24 Reason J. A systems approach to organisation errors. Ergonomics (in press).
25 Hudson P, Reason J, Wagenaar W, Bentley P, Primrose M, Visser J. Tripod Delta: proactive approach to enhanced safety. Journal of Petroleum Technology 1994;46:58-62.