Busting The Myths of Alarm Management


Bill Hollifield
Principal Alarm Management Consultant, PAS

2008 Pipeline Conference and Cybernetics Symposium


April 2008, Orlando, FL

Alarm Management Myths Abound!

Alarm Management is a major issue
Inexperienced, self-proclaimed experts are out there
Misinformation is on the internet
Proper Alarm Management will help improve safety and reliability of industrial plants

What went into the book:

Over 12 years of experience & over 100 person-years of effort
Comprehensive compilation of best practices
Lessons learned from hundreds of successful projects
Practical, field-proven strategies and techniques
A significant update to EEMUA 191
ISA Version

How did we get in this mess?

[Figures: "Configured Alarms Per Operator Position", 1960 to 2000; "Alarms Per Day" (alarm events recorded over 8 weeks) plotted against the Max. Acceptable (300) and Manageable (150) lines; callout "Operator Alarm Handling Capacity".]

Thousands of alarms that must be screened / dropped / ignored by the operator!
Not a safe or desirable situation!

Some Benefits of an Overloaded Alarm System

Be on the TV news! That's always good.
Get to know your OSHA inspectors really well. They just want to help you.

Poor Alarm Systems Encourage Operating by Alarm

No way to run a process:
Alarm! Too High!
Alarm! Too Low!
Alarm! Right of course!
Alarm! Left of course!

The Main Myths of Alarm Management

You don't need an Alarm Philosophy
Alarm Management is about Software!
Alarm Management is about Counting Your Alarms
Alarm Management is about Getting Rid of Alarms
Alarm Management is something you can buy
Alarm Management is about Endless Consulting Services

Overloaded Alarm Systems are Easy to Create

Step 1: Unpack the DCS Box
Step 2: Turn on all the alarms supplied by the manufacturer (They're free!)
Step 3: Mission accomplished! Enjoy!

[Figure: a shipping box labeled "This end up", "E-Z assembly", and "Adult Supervision Recommended!", with parody DCS vendor names: Moneywell, Infinity and Beyond, YokoOno, Landscape, Cleamans, Loxburrow, BCC, Yamaguchi, Scaba, Mostly Electric, Endorphin Melta-P, HAL 9000 (for APC, some bugs reported).]

Alarm types supplied by the manufacturer:
HI-HI Value, HI Value, LO Value, LO-LO Value
Rate-of-Change Positive, Rate-of-Change Negative
Deviation High, Deviation Low
Output High, Output Low
Significant Change, Configuration Error, Non-Normal Mode
Off-Normal, Command-Disagree, Logic Output, Value Out-Of-Range
and more
Add many more for Fieldbus!

Overloaded Alarm Systems Are Easy To Create!

#1. Don't waste time thinking. Use rules of thumb instead!

Turn on all the Analog Limit alarms
Turn on all the Rate-of-change alarms
Turn on all of the Deviation alarms
Turn on all of the Off-Normal alarms
Get creative. Make up some new ones!
and so forth

[Figure: an analog point on a 0% to 100% scale with rule-of-thumb alarm limits: HHH 95%, HH 90%, H 80%, L 20%, LL 10%, LLL 5%, plus a made-up alarm: "Alarm! Value Returning to Normal Range!!!"]

The Cure: Seven Steps to Highly Effective Alarm Management

Step 1: Develop, Adopt, and Maintain an Alarm Philosophy
Step 2: Collect Data and Benchmark Your Systems
Step 3: Perform Bad Actor Alarm Resolution
Step 4: Perform Alarm Documentation and Rationalization (D&R)
Step 5: Implement Alarm Audit and Enforcement Technology
Step 6: Implement Real Time Alarm Management
Step 7: Control and Maintain Your Improved System

[Slide callouts beside the steps: "Always Needed", "Often Done Simultaneously", "Needed Based Upon Performance"]

Myth: You Don't Need an Alarm Philosophy

Alarm Philosophy: A complete, customized, and comprehensive document covering how to do alarms right at your location.

CONTENTS Of An Alarm Philosophy


1.0 Alarm Philosophy Introduction
2.0 Purpose and Use
3.0 Alarm Definition and Criteria
4.0 Alarm Annunciation and Response
4.1 Navigation and Alarm Response
4.2 Use of External Annunciators
4.3 Hardwired Switches
4.4 Annunciated Alarm Priority
5.0 Alarm System Performance
5.1 Alarm System Champion
5.2 Alarm System KPIs
5.3 Alarm Performance Report
6.0 Alarm Handling Methods
6.1 Nuisance Alarms
6.2 Alarm Shelving
6.3 State-Based Alarms
6.4 Alarm Flood Suppression
6.5 Operator Alert Systems
7.0 Alarm Rationalization
7.1 Areas of Impact and Severity of Consequences
7.2 Maximum Time for Response and Correction
7.3 Priority Matrix
7.4 Alarm Documentation
7.5 Alarm Trip Point Selection
7.6 The Focused D&R Option
8.0 Specific Alarm Design Considerations
8.1 Handling of Alarms from Instrument Malfunctions
8.2 Alarms for Redundant Sensors and Voting Systems
8.3 External Device Health and Status Alarms
8.4 ESD Systems
8.5 ESD Bypasses
8.6 Duplicate Alarms
8.7 Consequential Alarms
8.8 Pre-Alarms
8.9 Flammable and Toxic Gas Detectors
8.10 Safety Shower and Eyebath Actuation Alarms
8.11 Building-Related Alarms
8.12 Alarm Handling for Programs
8.13 Alarms to Initiate Manual Tasks
8.14 DCS System Status Alarms
8.15 Point and Program References to Alarms
8.16 Operator Messaging System
9.0 Management of Change
10.0 Training
11.0 Alarm Maintenance Workflow Process
Plus Appendices

If you do not specify how to do alarms right, hundreds of world-wide examples indicate that alarms will be done wrong.

Alarm Philosophies must be developed; they cannot just be bought.

The Primary Principles for Alarm Creation

Alarms notify the operator of events requiring action.

The commonly violated rules:
Alarmed events must require operator action
Alarms must be based on the best indicator of the situation's root cause
Alarms must result from truly abnormal situations, never from normal situations

Alarm systems are so easy to use that they are used for all sorts of inappropriate purposes!

Common Ways to Violate these Principles

Create alarms that indicate the system is working as expected, or normally.

Wrong: Alarm Successful Operation
Alarm: Step 1 Complete
Alarm: Step 2 Complete
Alarm: Step 3 Complete
Alarm: Step 4 Complete

Right: Alarm Unsuccessful Operation
Alarm: Step 2 Failed to Complete

Status changes are shown via graphics, not by misusing the alarm system!

Spare Pumps: commonly alarmed incorrectly:
Running: No Alarm
Not Running: Off-Normal Alarm
Do not alarm things that are off. Alarm them only when they are off but are supposed to be on!
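The spare-pump rule can be sketched in a few lines of Python. This is only an illustration: the `PumpStatus` type and its field names are hypothetical, and a real DCS would express the same condition in its own configuration language.

```python
from dataclasses import dataclass


@dataclass
class PumpStatus:
    running: bool            # actual state reported from the field
    commanded_to_run: bool   # state the control system expects


def should_alarm(pump: PumpStatus) -> bool:
    """Alarm only when the pump is off but is supposed to be on."""
    return pump.commanded_to_run and not pump.running


spare = PumpStatus(running=False, commanded_to_run=False)   # idle spare: normal
failed = PumpStatus(running=False, commanded_to_run=True)   # stopped unexpectedly
print(should_alarm(spare), should_alarm(failed))
```

An idle spare produces no alarm; only the pump that stopped while commanded to run does.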

Myth: Alarm Management is About Software

Poorly performing alarm systems do not create themselves!
Proper Work Practices are needed to correct or create a properly performing alarm system
Software is just a tool to identify problems and augment proper Work Practices

Common improper Work Practices relative to alarm systems:
Uncontrolled Alarm Suppression
Failure to fix nuisance alarms
Improper alarm creation practices
Failure to monitor and report performance
Improper alarm prioritization
Failure to document alarms
Uncontrolled change of alarm settings
Improper use of alarm types

Myth: Counting Alarms is Alarm Management

Yes, and weighing myself will get rid of my extra pounds!

Alarm Analysis is an essential part of alarm management, but is only a tool to identify problems that require work to correct.

Some important Alarm System Performance Measurements:
Alarms Per Day, Annunciated and Suppressed
Alarms Per 10 Minutes
Alarm Floods
Alarm Priority Distribution
Stale Alarms
Alarms By Type
Chattering Alarms
Most Frequent Alarms

[Figure: "Chattering Alarms (3 Alarms in 1 Minute)", a bar chart of counts per alarm. Bar values: 353, 197, 172, 144, 135, 109, 106, 94, 92, 73. The tag labels on the axis are not recoverable from the extraction.]
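A minimal sketch of how the "3 alarms in 1 minute" chattering criterion could be applied to an alarm journal. The event format (timestamp, alarm name) and the tag names are assumptions for illustration, not the format of any particular DCS or analysis tool.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def chattering_alarms(events, min_count=3, window=timedelta(minutes=1)):
    """Return the set of alarm names that annunciate min_count or more
    times within any span of `window`.

    events: iterable of (timestamp, alarm_name) tuples, in any order.
    """
    times_by_alarm = defaultdict(list)
    for ts, name in events:
        times_by_alarm[name].append(ts)

    chattering = set()
    for name, times in times_by_alarm.items():
        times.sort()
        # Compare each occurrence with the one (min_count - 1) places later.
        for i in range(len(times) - min_count + 1):
            if times[i + min_count - 1] - times[i] <= window:
                chattering.add(name)
                break
    return chattering


t0 = datetime(2008, 4, 1, 8, 0, 0)
sample = [
    (t0, "FC101.PVLO"),                          # hypothetical tags
    (t0 + timedelta(seconds=20), "FC101.PVLO"),
    (t0 + timedelta(seconds=45), "FC101.PVLO"),
    (t0, "TI202.PVHI"),
    (t0 + timedelta(minutes=30), "TI202.PVHI"),
]
print(chattering_alarms(sample))
```

Counting is only the identification step; each alarm it flags still needs work to correct.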

Example: Alarms Per Day, Annunciated and Suppressed

[Figure: "Recorded Alarms Per Day", 56 Days Between Oct 12, 2003 and Dec 28, 2003. Recorded alarms and annunciated alarms are plotted on a 0 to 6000 scale against the 'Manageable' (300/day) and 'Acceptable' (150/day) lines. Callout: Alarm Suppression, often uncontrolled.]

147 Tags with 483 Alarms are Suppressed
Uncontrolled Suppression: NOT the way to solve an alarm problem!

Example: Alarms Per 10 Minutes

[Figure: "Annunciated Alarms per 10 Minutes" over 42 days, on a 0 to 700 scale. Peaks exceed 700; the highest 10-minute rate = 852.]

Alarm Flood = 10+ alarms in 10 minutes
Alarm floods begin when alarm rates exceed 10 alarms in 10 minutes
Alarm rates from >1,000 to >5,000 alarms in 10 minutes have been seen.
Bursts in the hundreds are common.
During a flood, important alarms are very likely to be overlooked
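One way to apply the flood criterion is to count annunciated alarms in consecutive 10-minute bins. The sketch below assumes a plain list of alarm timestamps and treats "10+" as 10 or more; the function and constant names are illustrative only.

```python
from collections import Counter
from datetime import datetime, timedelta

FLOOD_THRESHOLD = 10              # alarms per bin ("10+ in 10 minutes")
BIN = timedelta(minutes=10)


def flood_bins(timestamps, start):
    """Count alarms in consecutive 10-minute bins measured from `start`
    and return {bin_start: count} for bins at or above the threshold."""
    counts = Counter((ts - start) // BIN for ts in timestamps)
    return {start + i * BIN: n
            for i, n in sorted(counts.items())
            if n >= FLOOD_THRESHOLD}


start = datetime(2008, 4, 1, 0, 0)
burst = [start + timedelta(seconds=30 * k) for k in range(12)]      # 12 alarms in bin 0
quiet = [start + timedelta(minutes=10 + 3 * k) for k in range(3)]   # 3 alarms in bin 1
print(flood_bins(burst + quiet, start))
```

Fixed bins are the simplest choice. A sliding window would also catch bursts that straddle a bin boundary, at the cost of a little more bookkeeping.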

Example: Alarm Floods, Count and Duration

Alarm Flood Analysis (Analysis Period: 90 Days)

Number of Floods                                             340
Floods Per Day                                               3.8
Total Alarms in All Floods                                   30,447
Average Alarms per Flood                                     90
Highest Alarm Count in a Flood                               2,787
Percentage of Alarms in Floods vs. All Annunciated Alarms    71.5%
Total Duration of Floods, in Hours                           149
Percentage of Time Alarm System is in a Flood Condition      6.90%

[Figure: "Alarm Floods - Alarm Count", 340 separate floods on a 0 to 1000 scale. Highest count in an alarm flood = 2787 (exceeds 1000!). Longest duration of flood = 4.5 hours.]

Alarm Systems in flood have little protective capacity and interfere with managing an abnormal situation
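The derived figures in the flood analysis follow directly from the raw counts and the 90-day analysis period. A short worked check, using only numbers shown on the slide:

```python
floods = 340
total_alarms_in_floods = 30_447
flood_hours = 149
analysis_days = 90

floods_per_day = floods / analysis_days
avg_alarms_per_flood = total_alarms_in_floods / floods
pct_time_in_flood = 100 * flood_hours / (analysis_days * 24)

print(f"{floods_per_day:.1f} floods/day")
print(f"{avg_alarms_per_flood:.0f} alarms/flood")
print(f"{pct_time_in_flood:.2f}% of time in flood")
```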

Example: Most Frequent Alarms

98% of this system's alarm events come from only 10 alarms!
Normal situation is 20% to 80%! All can be fixed.

[Figure: "Top 10 Most Frequent Annunciated Alarms", alarm count (0 to 180000) and cumulative % (0 to 100) for: 43FC155.PVLO, 43MV018.CMDDIS, 43MV010.CMDDIS, 43MV022.CMDDIS, 43MV018.BADPV, 43MV010.BADPV, 43PAH397.OFFNRM, 43MV024.BADPV, 43MV006.BADPV, 43MV022.BADPV]

Step 3: Fix Your Bad Actor Alarms!

The top 10 alarms usually make up 20% to 80% of the entire alarm system load
Chapter 14: Common Alarm Problems and How to Solve Them
These methods are easy to learn and apply!

[Figure: "Top 10 Most Frequent Annunciated Alarms", alarm count and cumulative % (same chart as in the previous example).]
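A "most frequent alarms" ranking with a cumulative percentage can be produced from a plain list of annunciated alarm names. This is a sketch only; the input format and the function name are assumptions, and the sample data is invented for illustration.

```python
from collections import Counter


def top_alarms(alarm_names, n=10):
    """Return [(name, count, cumulative_pct)] for the n most frequent
    alarms, where cumulative_pct is the share of ALL events covered so far."""
    counts = Counter(alarm_names)
    total = sum(counts.values())
    rows, running = [], 0
    for name, count in counts.most_common(n):
        running += count
        rows.append((name, count, round(100 * running / total, 1)))
    return rows


events = ["A"] * 6 + ["B"] * 3 + ["C"]   # made-up journal of 10 events
for row in top_alarms(events, n=2):
    print(row)
```

The last cumulative value is the share of the total alarm load carried by the bad actors, which is the number to watch before and after fixing them.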

BAD ACTOR Alarms: Expected Gain

Common Nuisance Alarm Types:
Chattering Alarms
Fleeting Alarms
Stale Alarms
Duplicate Alarms
Nuisance Diagnostic Alarms
Alarms that do not represent events requiring Operator Action

PAS Bad Actor Alarm Work Process Results
(Reduction = reduction from PAS Bad Actor Recommendations)

System       Baseline Alarms    Reduction    % Reduction
System 1     339,521            325,423      95.8%
System 2     225,668            133,307      59.1%
System 3     414,887            333,395      80.4%
System 4     64,695             46,749       72.3%
System 5     93,848             71,372       76.1%
System 6     79,434             72,935       91.8%
System 7     482,375            413,094      85.6%
System 8     644,487            593,904      92.2%
System 9     183,312            77,417       42.2%
System 10    106,212            38,566       36.3%
System 11    91,686             29,188       31.8%
System 12    39,305             8,625        21.9%
System 13    33,115             22,646       68.4%
System 14    44,527             24,882       55.9%
System 15    58,049             51,782       89.2%
System 16    13,598             4,138        30.4%
System 17    21,071             8,516        40.4%
System 18    20,739             13,152       63.4%
System 19    5,567              2,247        40.4%
System 20    1,271              868          68.3%

Average system load improvement is ~60% from resolving Bad Actor alarms
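The "% Reduction" column is the reduction divided by the baseline alarm count. A short check against the first three systems in the table (the dictionary layout is only for illustration):

```python
# (baseline alarms, reduction) for the first three systems in the table
results = {
    "System 1": (339_521, 325_423),
    "System 2": (225_668, 133_307),
    "System 3": (414_887, 333_395),
}


def pct_reduction(baseline, removed):
    """Share of the baseline alarm load removed, in percent."""
    return 100 * removed / baseline


for name, (baseline, removed) in results.items():
    print(f"{name}: {pct_reduction(baseline, removed):.1f}%")
```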

Step 4: Alarm Documentation and Rationalization

Alarm Rationalization: A Rigorous, Effective, Best Practice Methodology That Achieves Excellent Results When Done Properly

Quotes from operators after alarm system improvement projects:
"Finally the alarm system makes sense."
"The alarm system is useful now. It sure wasn't before."
"You can understand the alarms now. They have real meaning."
"I'm not constantly dealing with a bunch of incomprehensible alarms anymore."
"The alarm system is now under control!"

Fix problems while they are small. Don't wait until they get big!

Alarm Rationalization:
Ensures your actual alarms comply with your alarm philosophy (operator actions, priorities, time to respond, etc.)
Documents your alarms (Trip Points, Causes, Consequences, Corrective Actions), creating a Master Alarm Database for:
  Operator Information
  Audit / Enforce and Management of Change
  Dynamic State-Based Alarm Management

Alarm Documentation & Rationalization Methodology

A team-based effort involving people with knowledge of your process.

Inputs:
P&IDs and Operating Graphics
Alarm and Control Configuration
ESD / APC Experts
Process History
Alarm Statistical Analysis
SOP, EOP, HAZOP, Etc.
D&R Software Tools

Plant Experience & Knowledge:
Process, Equipment, Operations, Procedures
Board Operators
Process & Control Engineers
Safety, Health, Environmental
Production & Maintenance Engineers

Myth: You can buy Alarm Rationalization.
Wrong! You can get experienced help, but only you have the necessary detailed knowledge of your process!

Alarm Priority Determination

Typical Grid-Based Priority Determination:

Impact Category: Personnel
NONE: No injury or health effect
MINOR: Slight injury (first aid) or health effect, no disability, no lost time recordable
MAJOR: Lost time recordable but no permanent disability. Reversible health effects (such as skin irritation).
SEVERE: Lost time injury, or worker disabling, or severe injuries, or life threatening
Note: Alarms where operator action is the primary method by which harm to a person is avoided shall be configured at the highest DCS priority.

Impact Category: Public or Environment
NONE: No effect
MINOR: Minimal exposure. No impact. Does not cross fence line. Contained release. Little, if any, clean up. Source eliminated. Negligible financial consequences.
MAJOR: Exposed to hazards that may cause injury. Hospitalizations and medical first aid possible. Damage Claims. Contamination causes some non-permanent damage.
SEVERE: Uncontained release of hazardous materials with major environmental impact and 3rd party impact. Exposed to life-threatening hazard. Disruption of basic services. Impact involving the community. Catastrophic property damage. Extensive cleanup measures and financial consequences.

Impact Category: Costs or Value of Production Loss
NONE: No loss
MINOR: Event costing <$10,000, notification only at Department Head level
MAJOR: Event costing $10,000 - $100,000, notification at Site Manager level
SEVERE: Event costing >$100,000, notification above Site Manager level

Severity of Consequence, plus Time Available to Respond (> 30 Minutes, 10 - 30 Minutes, 3 - 10 Minutes, <3 Minutes), determines Alarm Priority.

Alarm Priority Determination
                  Severity of Consequences
Time Available    None        Minor       Major        Severe
> 30 Min          No Alarm    No Alarm    No Alarm     No Alarm     (Re-engineer the Alarm for Urgency)
10 - 30 Min       No Alarm    LOW         LOW          HIGH
3 - 10 Min        No Alarm    LOW         HIGH         HIGH
<3 Min            No Alarm    HIGH        EMERGENCY    EMERGENCY
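A grid-based priority determination reduces to a table lookup. The sketch below encodes the grid as read from this slide; the handling of values exactly on a boundary (for example, exactly 10 minutes) is an assumption, and a site's own alarm philosophy would define both the cells and the boundaries.

```python
SEVERITIES = ("none", "minor", "major", "severe")

# Rows: time available to respond; columns follow SEVERITIES.
PRIORITY_MATRIX = {
    ">30":   ("No Alarm", "No Alarm", "No Alarm", "No Alarm"),
    "10-30": ("No Alarm", "LOW", "LOW", "HIGH"),
    "3-10":  ("No Alarm", "LOW", "HIGH", "HIGH"),
    "<3":    ("No Alarm", "HIGH", "EMERGENCY", "EMERGENCY"),
}


def time_band(minutes):
    """Map minutes available to respond onto the grid's row labels.
    Boundary handling here is an assumption, not taken from the slide."""
    if minutes > 30:
        return ">30"
    if minutes >= 10:
        return "10-30"
    if minutes >= 3:
        return "3-10"
    return "<3"


def alarm_priority(severity, minutes_to_respond):
    """Look up the priority for a severity and a time available to respond."""
    row = PRIORITY_MATRIX[time_band(minutes_to_respond)]
    return row[SEVERITIES.index(severity)]


print(alarm_priority("major", 5))
print(alarm_priority("severe", 1))
```

When several impact categories apply to one alarm, a common approach is to take the most severe of them before the lookup, but that choice also belongs in the alarm philosophy.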

Myth: Alarm Management is about Getting Rid of Alarms

In Alarm Rationalization, you will get rid of many alarms. That is a side effect of the initial poor configuration.

Alarm Rationalization is about getting the alarm settings right:
To ensure alarms are engineered properly
To ensure consistency in alarm settings
To eliminate duplicate alarms
To ensure proper and meaningful Priority and Alarm Trip Point settings

Experienced and targeted consulting services can be valuable when learning how to do D&R.

[Figure: Alarm priority distribution (#3, #2, #1), "PAS/EEMUA/ASM Best Practice" vs. "The Easy Way". Bar labels: 80%, 15%, 5% and 98%, 1%, 1%.]

Not-So-Great Alarm Designs, Present and Past

PRESENT: [image not recoverable]
PAST: The 1201 alarm almost cost the U.S. over 1 billion dollars. One of the worst alarm designs in history!

Step 5: Alarm Settings Audit and Enforcement

True or False?
Your Operators do not have the keys or passwords that enable them to change alarm settings.
Your engineers would never make an improper change in your control system.
Your maintenance personnel wouldn't even think of changing your alarm system, even if the operators ask.
Control Systems Contractors working on-site would never alter the system, even if asked by someone who signs their check.

If all are TRUE, you don't need to audit / enforce your alarm settings

Step 5: Alarm Settings Audit and Enforcement

Typical Data:

Summary of Changes in Alarms
Type of Change           Quantity During Analysis Period
Alarm Suppression        79
Alarm Trip Points        181
Alarm Priority           92
Tag Range                121
Tag Execution Status     175
Total                    648
Average Per Day          5.6

Company: No one here changes alarms without getting authorization and following MOC!
Me: Have you seen this data?
Company: Uh... That must have been part of a project!
Me: These changes were typically done between midnight and 6 AM.
Company: Hmmm, maybe we do have a problem.

Examine your own data!

Alarm Audit and Enforce

Master Alarm Database
Read: Audit alarm values from DCS, compare to Master Alarm Database
Generate Exception Reports
Write (optional, and with control): Enforce alarm settings to DCS

The foundation for other advanced alarm management techniques
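The audit step amounts to comparing what is configured in the DCS against the Master Alarm Database and reporting every difference. A minimal sketch, assuming both are available as dictionaries keyed by (tag, alarm type); the field names and tags are hypothetical, and reading from or writing to a real DCS is outside the scope of this example.

```python
def audit(master, dcs):
    """Compare DCS alarm settings with the master database.

    Both arguments map (tag, alarm_type) -> {"trip_point": ..., "priority": ...}.
    Returns a list of (key, field, expected, actual) exceptions.
    """
    exceptions = []
    for key, expected in master.items():
        actual = dcs.get(key)
        if actual is None:
            exceptions.append((key, "alarm", "configured", "missing"))
            continue
        for field, want in expected.items():
            if actual.get(field) != want:
                exceptions.append((key, field, want, actual.get(field)))
    # Alarms present in the DCS that were never rationalized.
    for key in dcs.keys() - master.keys():
        exceptions.append((key, "alarm", "not in master", "configured"))
    return exceptions


master = {("FC101", "PVHI"): {"trip_point": 80.0, "priority": "HIGH"}}
dcs = {
    ("FC101", "PVHI"): {"trip_point": 95.0, "priority": "HIGH"},
    ("TI202", "PVLO"): {"trip_point": 10.0, "priority": "LOW"},
}
for row in audit(master, dcs):
    print(row)
```

Enforcement would write the expected value back for each exception, which is why the slide marks it as optional and subject to control.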

Alarm Shelving

The safe, controlled, and effective way to temporarily suppress alarms
Generally beyond the capability of a DCS as-shipped.

Addresses concerns about DCS alarm suppression:
All Shelved alarms are visible and cannot be forgotten about
Limit the time an alarm can be out of service
Shelves individual alarms, not all alarms on a tag
Tracking of all shelved alarms, with reports
Security allows shelving, but not other alarm changes.
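The shelving properties listed above (per-alarm shelving, a time limit, and a visible list) can be illustrated with a small in-memory sketch. The `Shelf` class, the 12-hour limit, and the tag names are all assumptions for illustration; they are not taken from any product.

```python
from datetime import datetime, timedelta

MAX_SHELF_TIME = timedelta(hours=12)   # assumed site limit, not from the slides


class Shelf:
    def __init__(self):
        self._shelved = {}   # (tag, alarm_type) -> (expiry, operator, reason)

    def shelve(self, tag, alarm_type, operator, reason, now, duration):
        """Shelve one alarm on a tag, capped at MAX_SHELF_TIME."""
        expiry = now + min(duration, MAX_SHELF_TIME)
        self._shelved[(tag, alarm_type)] = (expiry, operator, reason)
        return expiry

    def is_shelved(self, tag, alarm_type, now):
        """True while the shelf period is active; expired entries are dropped."""
        entry = self._shelved.get((tag, alarm_type))
        if entry and entry[0] <= now:
            del self._shelved[(tag, alarm_type)]
            entry = None
        return entry is not None

    def report(self):
        """List every shelved alarm so none can be forgotten."""
        return sorted((k, *v) for k, v in self._shelved.items())


shelf = Shelf()
now = datetime(2008, 4, 1, 8, 0)
expiry = shelf.shelve("FC101", "PVLO", "op1", "transmitter repair",
                      now, timedelta(hours=24))
print(expiry, shelf.report())
```

A request for 24 hours is capped at the limit, and only the one alarm named is shelved; other alarms on the same tag stay active.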

Does One Size Fit All?

STATE-BASED ALARMING

IF Your Process:
Makes Multiple Products or Grades
Uses Multiple Differing Feedstocks
Has Parallel Operating Trains
Has Different Modes of Operation
Runs at Different Rates

Then:
Don't have only ONE set of unchanging, compromise alarm settings for your alarms.
State-based alarming technology lets you have multiple alarm settings that are optimum and correct for all your operating conditions.

Detect Plant State Change -> Automatically Alter Alarm Settings to Match New State
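State-based alarming can be pictured as one table of alarm settings per operating state, with the active table switched when a state change is detected. In the sketch below the state names, tags, and trip points are invented for illustration; how the state is detected is left out.

```python
# Hypothetical per-state trip points for one analog point (units arbitrary).
SETTINGS_BY_STATE = {
    "GRADE_A": {"FC101.PVHI": 80.0, "FC101.PVLO": 20.0},
    "GRADE_B": {"FC101.PVHI": 65.0, "FC101.PVLO": 35.0},
}


def changes_on_transition(old_state, new_state):
    """Return {alarm: (old_value, new_value)} for settings that differ
    between the two states, i.e. what must be written on a state change."""
    old = SETTINGS_BY_STATE[old_state]
    new = SETTINGS_BY_STATE[new_state]
    return {alarm: (old.get(alarm), value)
            for alarm, value in new.items()
            if old.get(alarm) != value}


print(changes_on_transition("GRADE_A", "GRADE_B"))
```

Each state's settings would themselves come out of rationalization, so every set is documented rather than being a compromise.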

Alarm Flood Suppression: Equipment Trips

Compressor States: RUNNING (default) and TRIPPED

Detect the TRIPPED state, and immediately address the following expected diagnostics plus closely related, expected process alarms:
Low Flow
Low Discharge Pressure
High Suction Pressure
Low Oil Pressure
Low Amps
Low Speed
Several BAD VALUE alarms
and so forth

Post-Shutdown, the important alarms are from the remainder of the process as it adjusts to the loss of the compressor. Diagnostics are a temporary distraction.
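The compressor example can be sketched as a filter keyed on the detected state. The alarm names come from the slide; the function, the string states, and the "Downstream Level High" alarm are hypothetical, and a real implementation would also log what it suppresses.

```python
# Alarms expected as a direct consequence of the trip (from the slide).
EXPECTED_AFTER_TRIP = {
    "Low Flow", "Low Discharge Pressure", "High Suction Pressure",
    "Low Oil Pressure", "Low Amps", "Low Speed",
}


def annunciate(alarms, compressor_state):
    """Split incoming alarms into (shown, suppressed) for the given state."""
    if compressor_state != "TRIPPED":
        return list(alarms), []
    shown = [a for a in alarms if a not in EXPECTED_AFTER_TRIP]
    suppressed = [a for a in alarms if a in EXPECTED_AFTER_TRIP]
    return shown, suppressed


incoming = ["Low Flow", "Low Amps", "Downstream Level High"]
print(annunciate(incoming, "TRIPPED"))
print(annunciate(incoming, "RUNNING"))
```

After the trip, only the alarm from the rest of the process reaches the operator; while running, nothing is filtered.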

Step 7: Control and Maintain your Improved Performance

FACT: A single unscheduled shutdown can wipe out all the benefits realized from APC and Optimization!
FACT: A few slightly-worse-than-normal production loss incidents can do the same thing.

[Figure: Plant Profitability vs. Time, showing the Break Even Point, the Profitable Region, the Normal Operating Region, and the Maximum Profitability Region (Optimum Profitability from APC & Optimization), with a Net Loss Due to Minor Process Upset and a Substantial Net Loss Due to Unscheduled Plant Shutdown.]

And while we're at it...

Let's fix some of these TERRIBLE Graphics!

[Figure: an example operating graphic; only scattered numeric values and a "BAD" indication survive extraction.]

...but that's another book entirely

The Main Myths of Alarm Management

You don't need an Alarm Philosophy
Alarm Management is about Software!
Alarm Management is about Counting Your Alarms
Alarm Management is about Getting Rid of Alarms
Alarm Management is something you can buy
Alarm Management is about Endless Consulting Services

Key Points

Massively overloaded alarm systems are a common problem everywhere!
They will occur wherever DCS systems are configured and maintained without a comprehensive alarm philosophy, documenting how to do alarms right.
Such systems are proven significant contributing factors to minor upsets and even major accidents.
The solutions to the problems are well known and fully documented.

Available at
www.pas.com
And at


Q&A

Any Questions?

Bill Hollifield (Bhollifield@pas.com)

www.pas.com (281) 286-6565
