Practice Service-Continuity-Management


Service continuity management
ITIL® 4 Practice Guide
AXELOS.com

25th February 2020

AXELOS Copyright View Only – Not for Redistribution © 2020



Contents
1 About this document
2 General information
3 Value streams and processes
4 Organizations and people
5 Information and technology
6 Partners and suppliers
7 Important reminder
8 Acknowledgments


1 About this document


This document provides practical guidance for the service continuity management practice. It is
split into five main sections, covering:

● general information about the practice
● the practice’s processes and activities and their roles in the service value chain
● the organizations and people involved in the practice
● the information and technology supporting the practice
● considerations for partners and suppliers for the practice.

1.1 ITIL® 4 QUALIFICATION SCHEME


Selected content of this document is examinable as a part of the following syllabus:

● ITIL Specialist High-Velocity IT

Please refer to the relevant syllabus document for details.


2 General information
2.1 PURPOSE AND DESCRIPTION
Key message

The purpose of the service continuity management practice is to ensure that the availability and
performance of a service are maintained at sufficient levels in case of a disaster. The practice
provides a framework for building organizational resilience with the capability of producing an
effective response that safeguards the interests of key stakeholders and the organization’s
reputation, brand, and value-creating activities.

Definition: Disaster

A sudden unplanned event that causes great damage or serious loss to an organization. To be
classified as a disaster, the event must match certain business-impact criteria that are
predefined by the organization.

The service continuity management practice helps to ensure a service provider’s readiness to
respond to high-impact incidents which disrupt the organization’s core activities and/or
credibility.

Ensuring service continuity is becoming more important and difficult. The service continuity
management practice is increasingly important in the context of digital transformation, because
the role of digital services is growing across industries. Major outages of services may have
disastrous effects on organizations that, in the past, focused on non-technological disasters.

Wider use of cloud solutions and wider integration with partners’ and service consumers’ digital
services are creating new critical dependencies that are more difficult to control. Partners and
service consumers usually invest in high-availability and high-continuity solutions, but a lack of
integration and consistency between organizations creates new vulnerabilities that need to be
understood and addressed.

The service continuity management practice, in conjunction with other practices (including the
availability management, capacity and performance management, information security
management, risk management, service design, relationship management, architecture
management, and supplier management practices, among others), ensures that the organization’s
services are resilient and prepared for disastrous events.

The concept of risk is central to the service continuity management practice. This practice usually
mitigates high-impact, low-probability risks which cannot be totally prevented (because some risk
factors are not under the organization’s control, such as natural disasters).

In the simplest terms, this practice is much like the incident management practice, except that
the potential for damage is much higher and it may threaten the service provider’s ability to
create value.

The service continuity management practice is closely related to, and in some contexts may be
merged with, the availability management practice within the service value system (SVS). It is also
closely related to, and may be incorporated into, the business continuity management practice in
a corporate context.

In a service economy, every organization’s business is service-driven and digitally enabled. This
may lead to a full integration of the disciplines because the business continuity management
practice is concerned with the continuity of digital services and service management. This
integration is possible and useful where digital transformation has led to the removal of the
borders between ‘IT management’ and ‘business management’ (see ITIL® 4: High-Velocity IT for
more on this topic).

2.2 TERMS AND CONCEPTS


Definition: Service continuity

The capability of the service provider to continue service operation at acceptable predefined
levels following a disaster event or disruptive incident.

For internal service providers, the main objective of the service continuity management practice is
to support the overall business continuity management practice by ensuring that, through
managing the risks that could affect IT services, the service provider can always provide the
relevant agreed service levels.

For external service providers, service continuity management equals business continuity
management.

Business continuity professionals are also interested in dealing with such business crises as adverse
media attention or disruptive market events. However, in this practice guide, the scope of the
service continuity management practice is limited to operational risks.

2.2.1 Disaster (or disruptive incident or crisis)


ISO defines a disaster as ‘a situation with a high level of uncertainty that disrupts the core
activities and/or credibility of an organization and requires urgent action’ 1.
It is usually a good idea to explicitly define the list of events which are considered to be disasters.
Doing so helps when developing a proper set of service continuity plans, which ensures
organizational readiness for disruptive events.

1 ISO 22300:2012


A list of disasters generally includes:


● cyber attacks
● electricity outages
● failures of strategic partners
● fires
● floods
● key personnel unavailability
● large-scale IT infrastructure failures (such as data-centre failures)
● natural disasters.
Defining those events which are not disasters is equally important. Usually, the service continuity
management practice does not cover:

● Minor failures. Failures should be considered minor or major based on business impact. It is
important to consider factors such as the service actions that are affected, the scale of failure,
time of failure, and so on 2.
● Strategic, political, market, or industry events.
To successfully recover from a disaster, a service provider should define the service continuity
requirements. Service continuity requirements include:

● recovery time objective (RTO)
● recovery point objective (RPO)
● minimum service continuity levels (see Figure 2.1).

Figure 2.1 Service continuity requirements: RTO, RPO, minimum target service level

2 See the Availability management practice guide for details.


2.2.2 Recovery time objective


Definition: Recovery time objective

The maximum period of time following a service disruption that can elapse before the lack of
business functionality severely impacts the organization. This represents the maximum agreed
time within which a product or an activity must be resumed, or resources must be recovered.

The main factors that should be considered in estimating the RTO are:

● the reduction in a service provider’s ability to deliver services and the costs associated with this
reduction
● service level agreement fines and regulatory judgments
● losses associated with diminished competitive advantage and reputation.
Business continuity professionals also use the terms ‘maximum tolerable period of disruption’ and
‘maximum acceptable outage’ (MAO), and distinguish them from the RTO.

ISO 22301:2012 provides the following definitions:


● MAO: The time it would take for adverse impacts, which might arise as a result of not providing
a product/service or performing an activity, to become unacceptable.
● RTO: The period of time following an incident within which a product or an activity must be
resumed, or resources must be recovered.
Following this logic, the RTO should be less than the MAO by an amount which accounts for the
organizational risk appetite 3. The MAO should be identified during business impact analysis. RTO
should be defined during the development of service continuity plans.
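
The relationship between the MAO and the RTO can be illustrated with a short sketch. It assumes
that risk appetite is expressed as a fractional safety margin; the function name `derive_rto` and
the `safety_margin` parameter are hypothetical and are not defined by ISO 22301 or the BCI
guidelines.

```python
from datetime import timedelta


def derive_rto(mao: timedelta, safety_margin: float) -> timedelta:
    """Return an RTO that is shorter than the MAO by a fractional safety margin.

    safety_margin is a hypothetical parameter in (0, 1): a lower risk appetite
    means a larger margin and therefore a shorter RTO.
    """
    if not 0 < safety_margin < 1:
        raise ValueError("safety_margin must be between 0 and 1 (exclusive)")
    return mao * (1 - safety_margin)


# MAO identified during business impact analysis (illustrative value).
mao = timedelta(hours=8)
# RTO defined while developing service continuity plans.
rto = derive_rto(mao, safety_margin=0.25)
assert rto < mao
print(f"MAO: {mao}, RTO: {rto}")
```

With an 8-hour MAO and a 25% margin, the sketch yields a 6-hour RTO. In practice the margin would
be agreed with the business rather than fixed in code.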

2.2.3 Recovery point objective


Definition: Recovery point objective

The point to which the information that is used by an activity must be restored in order to enable
the activity to operate effectively upon resumption.

The RPO defines the maximum acceptable period of data loss. If the RPO is 30 minutes, the most
recent backup should have been taken no more than 30 minutes before a disruptive event so that,
when service delivery is resumed, no more than the last 30 minutes of data is lost.
The main factors that should be considered in estimating the RPO are:
● criticality of the service that used the data
● criticality of the data
● data-production rate.
For example, an online shop takes 100 orders per hour. Executives say that losing more than 200
orders would be unacceptable. Therefore, the RPO is 2 hours (200 orders divided by 100 orders per
hour).
The RPO defines the requirement for backup frequency. Backup management must ensure the
availability of a recent backup copy in case of disaster.
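
The online shop arithmetic can be written as a small calculation. This is a minimal sketch: the
function names and the backup-interval check are illustrative assumptions, not part of the
practice guide.

```python
from datetime import timedelta


def rpo_from_tolerable_loss(max_lost_records: int, records_per_hour: float) -> timedelta:
    """Derive an RPO from the tolerable data loss and the data-production rate."""
    if records_per_hour <= 0:
        raise ValueError("records_per_hour must be positive")
    return timedelta(hours=max_lost_records / records_per_hour)


def backup_interval_meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """A backup schedule can satisfy the RPO only if backups are at least that frequent."""
    return backup_interval <= rpo


# Values from the example: 100 orders per hour, 200 orders of tolerable loss.
rpo = rpo_from_tolerable_loss(max_lost_records=200, records_per_hour=100)
print(f"RPO: {rpo}")
print("Hourly backups sufficient:", backup_interval_meets_rpo(timedelta(hours=1), rpo))
print("Daily backups sufficient:", backup_interval_meets_rpo(timedelta(days=1), rpo))
```

The check compares only the schedule with the RPO. It does not account for the time a backup takes
to complete or for failed backups, which a real backup policy would also need to consider.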

3 BCI Good practice guidelines 2013


2.2.4 Minimum target service level


Definition: Minimum target service level

The level of service which is acceptable to the service provider to achieve its objectives during a
disruption. 4

While recovering from a disaster, a service provider should usually provide the service at some
minimum target service level. Even though there are no specific requirements from the customer,
achieving a minimum service level can help to minimize losses.
The minimum target service level is usually defined in terms of:
● list of specific service actions and functionality points that should be available to the users
during a disruption
● limited number of users or specific group of users who should have access to the service during a
disruption
● limited number of transactions per time period that users should be able to process during a
disruption.

2.2.5 Business impact analysis


Definition: Business impact analysis

A key activity in the practice of service continuity management that identifies vital business
functions (VBFs) and their dependencies. These dependencies may include suppliers, people, other
business processes, and IT services. Business impact analysis defines the recovery requirements for
IT services. These requirements include RTOs, RPOs, and minimum target service levels for each IT
service.

Business impact analysis (BIA) is a process of analysing activities and the effect that a disruption
might have on them 5.
According to ISO 22301, business impact analysis should include:
● identifying activities that support the provision of products and services
● assessing the impacts over time of not performing these activities
● setting prioritized timeframes for resuming these activities at a specified minimum acceptable
level, considering the time within which the impacts of not resuming them would become
unacceptable
● identifying dependencies and supporting resources for these activities, including suppliers,
outsource partners, and other relevant interested parties.
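
The second and third steps above (assessing impacts over time and setting prioritized timeframes)
can be sketched as follows. The impact figures, the threshold, and all names are hypothetical
illustrations; a real BIA would use impact estimates agreed with the business.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Each curve is a list of (hours since disruption, cumulative impact) points.
ImpactCurve = Sequence[Tuple[float, float]]


def estimate_mao_hours(curve: ImpactCurve, unacceptable_impact: float) -> Optional[float]:
    """Return the first assessed point in time at which impact becomes unacceptable."""
    for hours, impact in sorted(curve):
        if impact >= unacceptable_impact:
            return hours
    return None  # the threshold is never reached within the assessed period


def prioritize(
    activities: Dict[str, ImpactCurve], unacceptable_impact: float
) -> List[Tuple[str, Optional[float]]]:
    """Order activities so that those with the shortest MAO are resumed first."""
    maos = {
        name: estimate_mao_hours(curve, unacceptable_impact)
        for name, curve in activities.items()
    }
    return sorted(
        maos.items(),
        key=lambda item: float("inf") if item[1] is None else item[1],
    )


# Hypothetical impact estimates (for example, financial loss in one currency).
activities = {
    "monthly reporting": [(24, 1_000), (72, 10_000), (168, 60_000)],
    "order processing": [(1, 5_000), (4, 40_000), (8, 120_000)],
    "internal wiki": [(24, 100), (168, 2_000)],
}
for name, mao in prioritize(activities, unacceptable_impact=50_000):
    print(name, mao)
```

The resulting order gives a first draft of recovery priorities. The RTO for each activity would
then be set below its estimated MAO.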

2.2.6 Service continuity/disaster recovery plans


Definition: Service continuity/disaster recovery plans

A set of clearly defined plans related to how an organization will recover from a disaster and
return to a pre-disaster condition, considering the four dimensions of service management.

Service continuity plans guide the service provider when responding, recovering, and restoring a
service to normal levels following disruption.

4 ISO 22301:2012
5 BCI Good practice guidelines 2013


Service continuity plans usually include:


● Response plan: This defines how the service provider initially reacts to a disruptive event in
order to prevent damage, such as in cases of fire or cyber-attack.
● Recovery plan: This defines how the service provider recovers the service in order to achieve
the RTO and RPO.
● Plan of returning to normal operations: This defines how the service provider resumes normal
operations following recovery. For example, if an alternative data centre has been in use, then
this phase will bring the primary data centre back into operation and restore the ability to
invoke IT service continuity plans again.
In many cases, there is also a need for business continuity planning. Business continuity plans may
include:
● emergency response: to interface with all emergency services and activities
● evacuation plan: to ensure the safety of personnel
● crisis management and public relations plan: plans for the command and control of different
crises and the management of the media and public relations
● security plan: showing how all aspects of security will be managed on all home sites and
recovery sites
● communication plan: showing how all aspects of communication will be handled and managed
with all relevant areas and parties involved during a major incident.
These plans are usually developed as part of the business continuity management practice.

2.3 SCOPE
The service continuity management practice includes the following areas:

● performing BIA to quantify the impact of service unavailability to the service provider and
service consumers
● developing service continuity strategies (and integrating them into the business continuity
management strategy, if relevant). This should include elements of risk-mitigation measures as
well as the selection of appropriate, comprehensive recovery options
● developing and managing service continuity plans (and providing a clear interface to business
continuity plans, if relevant)
● performing exercises and testing the service continuity plans, and invoking them in case of
disaster.
There are several activities and areas of responsibility that are not included in the service
continuity management practice, although they are still closely related to service continuity
management. These are listed in Table 2.1, along with references to the practices in which they
can be found. It is important to remember that ITIL practices are merely collections of tools to use
in the context of value streams; they should be combined as necessary, depending on the
situation.


Table 2.1 Activities related to the service continuity management practice described in other
practice guides

Activity | Practice guide
Communicating with customers to align the customer’s business continuity strategy and plans with service provider’s service continuity strategy and plans | Relationship management
Negotiating and agreeing customer requirements for service continuity | Service level management
Designing service continuity solutions as a part of the service model | Service design
Aligning service continuity solutions with business architecture | Architecture management
Identifying risks associated with service continuity | Risk management
Establishing and managing contracts with suppliers and partners | Supplier management
Monitoring the availability of services | Monitoring and event management
Justifying new service continuity solutions | Portfolio management
Implementing risk mitigation measures and changing the IT infrastructure in order to ensure resilience | Project management, change control
Managing and implementing improvements on an ongoing basis | Continual improvement

2.3.1 The line between availability and continuity


The line between the service continuity and availability management practices is subtle. Both
practices involve the concept of risk and work to identify and prepare for events that threaten to
disable services. For both practices, either an understanding of VBFs and risk assessments or a BIA
of service failures is required. Ultimately, both practices ensure the organization's resistance to
failures.

Some organizations prefer not to separate the management of availability and continuity.
However, there are some differences between the two practices, outlined in Table 2.2, that
should be considered when designing a service management system.


Table 2.2 Distinction between availability management and service continuity management

Availability management | Service continuity management
Focus on high-probability risks | Focus on high-impact risks (emergencies, disasters)
More proactive | More reactive
Reduces the likelihood of unwanted events | Reduces the impact of unwanted events
Focus on technical solutions | Focus on organizational measures
Optimization | Creating redundancy
Not a part of the corporate function | Often a part of the corporate function
Business as usual | Exceptional circumstances
MTRS, MTBF, MTBSI | RTO, RPO

The service continuity management practice does not cover minor or short-term failures that do
not seriously impact the organization. It focuses on risks associated with significant damage,
regardless of how likely or unlikely they are to occur. Often, these are emergency situations: fires,
floods, power outages, data centre failures, and so on. Although the availability management
practice does not ignore the negative impacts of failures on the service provider and consumer,
minor interruptions of individual components are also considered in the process.

There is a tension between the objectives of the practices. The availability management practice
works with statistics and analyses trends; continuity management is concerned with how to
respond to disruptive events.

Availability planning focuses on fulfilling current and future agreed requirements and avoiding
deviations. The availability management practice finds and eliminates single points of failure; the
countermeasures that are implemented are generally proactive and they reduce the likelihood of
unwanted events. The service continuity management practice focuses on planning to manage the
serious consequences of disruptive events. Backup sites, transitioning to alternative methods of
service provision, and recovery procedures all reduce damage, but generally do not impact the
probability of an incident.

2.3.2 Incident management


The activities of the incident management practice are very similar to those of the service
continuity management practice. However, the incident management practice focuses on failures
which do not threaten the organization’s resilience, whereas the service continuity management
practice focuses on high-impact failures which can prevent the organization from resuming service
delivery.

Again, the line between these two practices is subtle and should be clearly defined in terms of
impact to the service provider and service consumers. At the same time, in some cases (usually in
small, single-site service providers) service continuity activities may be performed as a part of
major incident management.


When service continuity plans are in place and managed separately from incident management
activities, there should be a clear criterion for triggering service continuity procedures. When
assessing the business impacts of an incident, support specialists should determine whether the
major incident may lead to a disaster and inform the crisis management group so that they can
make a decision about invocation.

Definition: Invocation

The act of declaring that a service provider’s service continuity plans must be enacted in order to
continue service delivery.
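
A clear criterion for triggering service continuity procedures could be expressed as predefined
business-impact thresholds. The sketch below is illustrative only: the threshold fields, their
values, and the function name are assumptions, and the function merely recommends escalation. The
invocation decision itself remains with the crisis management group.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisasterCriteria:
    """Hypothetical business-impact thresholds predefined by the organization."""
    max_outage_hours: float
    max_affected_users: int
    max_financial_loss: float


@dataclass(frozen=True)
class ImpactAssessment:
    """Estimates made by support specialists during a major incident."""
    expected_outage_hours: float
    affected_users: int
    estimated_financial_loss: float


def should_escalate(assessment: ImpactAssessment, criteria: DisasterCriteria) -> bool:
    """Recommend informing the crisis management group if any threshold is exceeded."""
    return (
        assessment.expected_outage_hours > criteria.max_outage_hours
        or assessment.affected_users > criteria.max_affected_users
        or assessment.estimated_financial_loss > criteria.max_financial_loss
    )


criteria = DisasterCriteria(
    max_outage_hours=4, max_affected_users=10_000, max_financial_loss=250_000
)
data_centre_failure = ImpactAssessment(
    expected_outage_hours=12, affected_users=2_000, estimated_financial_loss=90_000
)
single_server_failure = ImpactAssessment(
    expected_outage_hours=1, affected_users=50, estimated_financial_loss=500
)
print("Data centre failure:", should_escalate(data_centre_failure, criteria))
print("Single server failure:", should_escalate(single_server_failure, criteria))
```

Exceeding any single threshold is enough to escalate in this sketch. An organization might instead
require a combination of criteria, depending on its definition of a disaster.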

2.3.3 The role of the service continuity practice when managing risks
The concept of risk is central to the service continuity management practice. This practice
generally focuses on mitigating high-impact, low-probability risks which cannot be totally
prevented.

In order to mitigate risks, this practice focuses on minimizing expected losses so that, when
disasters happen, they do not cause significant damage.

To ensure readiness regarding disruptive events, the service continuity management practice
needs information about risks, which can be obtained through the risk management practice.

An effective service continuity management practice can contribute significantly to the
organization’s risk management. A large number of risk-mitigation measures are related in some
way to service-continuity options.


2.4 PRACTICE SUCCESS FACTORS


Definition: Practice success factor

A complex functional component of a practice that is required for the practice to fulfil its
purpose.

A practice success factor (PSF) is more than a task or activity, as it includes components of all four
dimensions of service management. The nature of the activities and resources of PSFs within a
practice may differ, but together they ensure that the practice is effective.

The service continuity management practice includes the following PSFs:

● developing and managing service continuity plans
● mitigating service continuity risks
● ensuring awareness and readiness.

2.4.1 Developing and managing service continuity plans


To effectively respond to and recover from disasters, a service provider needs service continuity
plans, which should reflect the chosen service continuity strategies. The service continuity
strategies should be selected with respect to the service continuity requirements, which are
identified during BIA.
Therefore, in order to develop and manage service continuity plans, the service provider should
first perform BIA, then select the proper set of service continuity requirements, then define the
service continuity strategy.
The Business Continuity Institute (BCI) defines the following continuity strategies 6:
● diversification
● replication
● standby
● post-incident acquisition
● do nothing
● subcontracting.
These are not one-time activities, because the service continuity requirements and the context
of the service provider keep changing. For example, when a service provider begins delivering its
service to a new consumer, this event is a trigger for re-performing the BIA and updating the
service continuity strategies. If there are no significant changes for a long period, BIA is generally
performed once or twice a year and synchronized with risk assessment cycles. For more detail on
BIA, refer to section 3.2.2.

2.4.1.1 Continuity plans


BCI introduces three levels in the response and recovery planning structure: strategic, tactical,
and operational 7, as shown in Table 2.3.

6 BCI Good practice guidelines 2013
7 BCI Good practice guidelines 2013


Table 2.3 Levels in the response and recovery planning structure

Level | Description
Strategic | How executives make decisions about recovery process, communicate with external parties (including media, if relevant), and deal with any situations that are not covered in service continuity plans
Tactical | How management coordinates the recovery process in order to ensure the appropriate allocation of resources according to priorities (current business priorities, seasonal changes, and so on) and manage conflicts between the planning and recovery teams
Operational | How teams perform recovery activities, including responding to disruptive events, recovering to pre-defined levels of service, and/or providing alternative facilities to continue operations

Depending on the type of service provider (internal or external) and the scale of the organization,
the structure of the service continuity plans may be more or less complex, and the body
responsible for them may vary. Some common structures are outlined in Table 2.4.

Table 2.4 Continuity plans structure options

Internal service provider, small-scale organization:
In an IT department of a small-scale organization, there may not be any service continuity plans.
All continuity arrangements may be managed as a part of business continuity management. Specific
IT service continuity activities may be performed as part of the incident management practice.

Internal service provider, large-scale organization:
● Strategic: a crisis management plan performed by executives. It is usually part of the business
continuity plan.
● Tactical: a number of plans, each one covering a product, service, business unit, site, or location,
each with its own recovery team. Tactical IT department activities may be included in the
business continuity plan, but they are more commonly designed as separate related plans.
● Operational: a number of detailed procedures for specific recovery activities (such as restoring
application data from a backup). Other departments may have their own specific operational
instructions as a part of continuity plans.

External service provider, small-scale organization:
All levels (strategic, tactical, operational) might be implemented as a single plan with a single
team covering all aspects of response and recovery.

External service provider, large-scale organization:
The description of continuity plans levels is similar to above, but the service provider is
accountable for all levels.

Service continuity plans should cover the stages outlined in Table 2.5 following a disaster.


Table 2.5 Stages of response and recovery

Stage: Response
Plan: Response plan
Content:
● Events and scenarios which should trigger service continuity plans
● Crisis management group contacts
● Procedure for initial response and minimizing potential losses. There generally are procedures
for specific scenarios (such as fires or power outages)
● Documented criteria for choosing a recovery option (if relevant)
● Communication procedures, including communications with customers, partners, and employees
● Documented triggers for invocation

Stage: Recover
Plan: Recovery plan
Content:
● Recovery team members’ contacts
● Guidelines for coordinating recovery teams
● Detailed description of recovery procedures
● Guidelines for monitoring and sharing information across the organization
● Escalation procedures

Stage: Restore
Plan: Plan of returning to normal operations
Content:
● Documented criteria to return to normal operations
● Detailed descriptions of returning-to-normal-operations procedures
● Instructions for restoring recovery site (if relevant)

Plans should be clear, concise, and action-oriented. Generally, they should exclude information
that does not directly apply to the recovery teams that use them. Procedures should be time-based
and include information about possible delays and interrelations between plans and teams.
For details about the organizational structure of response and recovery, see section 4.2.

2.4.2 Mitigating service continuity risks


The service continuity management practice includes the definition and management of controls
to manage a wide range of risks. For this, it is used in conjunction with the risk management
practice and other risk-focused practices (such as the capacity and performance management,
availability management, and information security management practices). Agreed availability
controls should be implemented through the service design, software development and
management, and infrastructure and platform management practices 8.

The service continuity options outlined in Table 2.6 may be designed and implemented as a part of
the overall risk mitigation plan.
Table 2.6 The four dimensions of the service continuity management practice

Service management dimension: Service continuity measures

Organizations and people:
• Managing people during disasters
• Using alternative sites and facilities

Information and technology:
• Physical security
• Resilient telecommunication network
• Data protection in operation: using RAID arrays, SAN, and so on to ensure the availability of data
• Data backup
• Fault-tolerant applications
• Monitoring to provide prompt alerts

Partners and suppliers:
• Reciprocal agreements
• Outsourcing services to multiple providers
• Fire detection systems or suppression systems as a service

Processes and value streams:
• Manual operations and alternative methods of service delivery
• Plans and procedures for response and recovery (service continuity plans)

8 Risk management practice

If the BIA of a service indicates an earlier and higher impact, more preventive measures need to be
adopted. If the initial impact is lower and develops slowly, a more economically effective
approach is to invest in continuity and recovery countermeasures.
When choosing service continuity measures, the effectiveness and efficiency of each option should
be assessed 9. It is also important to continually control and validate their ongoing effectiveness
and efficiency.
● Effectiveness: According to risk management principles, the effects of a service continuity
measure should be assessed and compared to the expected losses of the disruptive event.
● Efficiency: The cost of the service continuity measure should be assessed and compared to the
benefit. The benefit is calculated by estimating the reduction in the probability of the
disruptive event occurring after the measure is implemented and multiplying it by the
expected impact to the service provider and customers if the event occurs. This value, in
terms of cost, should be compared to the cost of the measure’s implementation. Cost-benefit
analysis can be used here.
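
The efficiency calculation described above can be sketched in a few lines. The probabilities,
impact, and cost below are invented for illustration, and the function names are hypothetical.
The sketch assumes that probability, impact, and cost all refer to the same period (for example,
one year).

```python
def expected_benefit(
    probability_before: float, probability_after: float, expected_impact: float
) -> float:
    """Benefit = reduction in event probability multiplied by the expected impact."""
    for p in (probability_before, probability_after):
        if not 0 <= p <= 1:
            raise ValueError("probabilities must be between 0 and 1")
    if probability_after > probability_before:
        raise ValueError("the measure should not increase the probability")
    return (probability_before - probability_after) * expected_impact


def is_cost_justified(benefit: float, cost_of_measure: float) -> bool:
    """A measure is efficient in this sketch if its benefit exceeds its cost."""
    return benefit > cost_of_measure


# Illustrative annual figures: the measure lowers the yearly probability of the
# disruptive event from 5% to 1%; the event would cost 2,000,000 if it occurred.
benefit = expected_benefit(0.05, 0.01, expected_impact=2_000_000)
print(f"Expected annual benefit: {benefit:,.0f}")
print("Justified at annual cost 50,000:", is_cost_justified(benefit, 50_000))
print("Justified at annual cost 120,000:", is_cost_justified(benefit, 120_000))
```

This is a deliberately simple cost-benefit comparison. It ignores measures that reduce impact
rather than probability, which would need the impact term to change as well.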

2.4.3 Ensuring awareness and readiness


Recovery plans that have not been tested often do not work as intended, if at all. Testing is
therefore a critical part of service continuity management and the only way of ensuring that the
selected strategy, implemented measures, and plans are actually working.
Testing service continuity plans is the way to check and increase readiness. By regularly revising
the plans and procedures, recovery teams discover flaws and inefficiencies, then update the
service continuity plans in order to reflect their findings.
BCI defines the following types of exercises 10:
● walkthrough
● table-top exercises
● command-post exercises
● live
● test.

9 For details see Risk management practice.
10 BCI Good practice guidelines 2013

The key characteristics and the purpose of each type, according to the BCI Good Practice Guidelines
2013, are outlined in Table 2.7.
Table 2.7 Exercise types

Exercise type: Walkthrough
Key characteristics:
• Discussion-based exercises
• Unpressurized environment
• Usually focuses on a specific area for improvement
Purpose:
• Allowing recovery team members to meet for the first time
• Exploiting improvement opportunities

Exercise type: Table-top exercises
Key characteristics:
• Discussion based on a given scenario
• Usually run in real time, but may include ‘time-jumps’ to allow different phases of the
scenario to be exercised
Purpose:
• Improving knowledge of the plans

Exercise type: Command-post exercises
Key characteristics:
• Recovery team members are given information in a way that simulates a real incident and are
invited to respond
Purpose:
• Testing communication, decision making, and coordination

Exercise type: Live
Key characteristics:
• The most realistic way to test plans
• May range from a small-scale rehearsal of the recovery of one component to a full-scale
rehearsal of the recovery of the whole service or organization
• Usually includes participating interested parties
Purpose:
• Testing the ability to achieve RTO, RPO, and minimum target service levels in case of a
disruptive event

Exercise type: Test
Key characteristics:
• It is usually applied to specific hardware or software, such as restoring application data from
backup
• According to ISO 22301, a test is a unique and particular type of exercise, which incorporates
an expectation of a pass or fail element within the goal or objectives of the exercise being
planned
Purpose:
• Testing service component recovery when there is a higher risk of failure

Exercises should be conducted at planned intervals and when there are significant changes which
may impact the recovery. The higher the possible impact of service outage, the higher the
frequency of exercising should be.
Exercising is not only a way of ensuring readiness; it is also an improvement opportunity. It is
generally a good idea to analyse the findings made during the testing and the overall recovery team
performance, then produce exercise reports that include findings and recommendations.

2.5 KEY METRICS


The effectiveness and performance of the ITIL practices should be assessed within the context of
the value streams to which each practice contributes. As with the performance of any tool, the
practice’s performance can only be assessed within the context of its application. However, tools
can differ greatly in design and quality, and these differences define a tool’s potential or
capability to be effective when used according to its purpose. Further guidance on metrics, key
performance indicators (KPIs), and other tools that can assist with this can be found in the
measurement and reporting practice guide.

Key metrics for the service continuity management practice are mapped to its PSFs. They can be
used as KPIs in the context of value streams to assess the contribution of the practice to the
effectiveness and efficiency of those value streams. Some examples of key metrics are given in
Table 2.8.

Table 2.8 Example metrics for practice success factors

Practice success factor: Developing and managing service continuity plans
Example metrics:
● Percentage of products and services with clearly documented continuity requirements
● Percentage of (critical) products and services with documented service continuity plans
● Timely updating of service continuity plans

Practice success factor: Mitigating service continuity risks
Example metrics:
● RTO achievement (real disasters and exercises)
● RPO achievement (real disasters and exercises)
● Percentage of effective continuity measures
● Ratio between actual losses and expected losses

Practice success factor: Ensuring awareness and readiness
Example metrics:
● Percentage of exercises and awareness sessions that were performed on schedule
● Percentage of services for which continuity plans are tested in a given time period (usually
the last 6 months)

The correct aggregation of metrics into complex indicators will make it easier to use the data for
the ongoing management of value streams, and for the periodic assessment and continual
improvement of the service continuity management practice. There is no single best solution.
Metrics will be based on the overall service strategy and priorities of an organization, as well as on
the goals of the value streams to which the practice contributes.

3 Value streams and processes


3.1 VALUE STREAMS CONTRIBUTION
Like any other ITIL management practice, service continuity management contributes to multiple
value streams. It is important to remember that a value stream is never formed from a single
practice. The service continuity management practice combines with other practices to provide
high-quality services to consumers. The main value chain activities to which the practice
contributes are:

● deliver and support
● design and transition
● improve
● obtain/build
● plan.
The contribution of the service continuity management practice to the service value chain is
shown in Figure 3.1.

Figure 3.1 Heat map of the contribution of the service continuity management practice to value chain
activities

3.2 PROCESSES
Each practice may include one or more processes and activities that may be necessary to fulfil the
purpose of that practice.

Definition: Process

A set of interrelated or interacting activities that transform inputs into outputs. A process takes
one or more defined inputs and turns them into defined outputs. Processes define the sequence of
actions and their dependencies.

Service continuity management activities form five processes:

● governance of service continuity management
● business impact analysis
● developing and maintaining service continuity plans
● testing service continuity plans
● response and recovery.

3.2.1 Governance of service continuity management


This process includes the activities listed in Table 3.1 and transforms the inputs into outputs.

Table 3.1 Inputs, activities, and outputs of the governance of service continuity management

Key inputs:
● Business impact analysis report(s)
● Risk register(s)
● Customer requirements
● Regulatory requirements
● Risk appetite
● Standards

Activities:
● Scope definition
● Policy setting
● Awareness and exercise programme development

Key outputs:
● Service continuity policy
● Documented roles and responsibilities
● Awareness and exercise programme

Figure 3.2 shows a workflow diagram of the process.

Figure 3.2 Workflow for the governance of service continuity management

These activities may be carried out with varying levels of formality by many people in the
organization. Table 3.2 describes these activities further.

Table 3.2 Activities of the governance of service continuity management process

Activity Description

Scope definition Defining the service continuity management practice’s scope ensures clarity
regarding which situations and areas of the organization it covers.

Organizational scope may be limited by products and services, sites and
locations, customers, and so on. Products and services which are legacy or
will be terminated soon are usually excluded from the scope, as are non-
critical, low-margin products and services.

The costs of implementing a service continuity management practice can be
high. Therefore, if a service provider initiates a service continuity
management programme, some services, products, or sites might initially be
excluded as part of a staged implementation.

Many different techniques can be used to define the practice’s scope,
including cost-benefit analysis, SWOT analysis, PESTLE analysis, and so on.

When defining scope, organizations should consider:

● previous business impact analysis report(s)
● existing risk register(s)
● customer requirements
● regulatory requirements.

It is also important to define the practice’s scope in terms of disasters.

Policy setting Policy setting includes:

● Documenting the scope.
● Assigning roles and responsibilities. If the service provider is only just
initiating a service continuity programme, there will be no organizational
structure to support any service continuity plans. In other cases, the
organizational structures of response and recovery teams are usually
part of the service continuity policy.
● Defining the general approach to service continuity management.
Service continuity policies should clarify the available resources and
limitations that should be considered during BIA.
● Policies should be established and communicated as soon as possible so
that all stakeholders involved in or affected by the service continuity
management practice are aware of the scope, the limitations, and their
responsibilities.
● The scope and policies should be regularly revised (usually once a year).
Revision may be triggered by disruptive events (especially any not
covered by plans), a new service, a new customer, or a new relationship
with a partner.

Awareness and exercise programme development
Testing is a critical part of the overall service continuity management
practice: it is the only way of ensuring that the selected strategy,
measures, and plans are working.

Education, awareness training, and exercises should be planned to ensure
that all parts of the practice (site, team member, service, or CI) are tested
at least once a year.

The exercise programme should ensure that all four dimensions of service
management are tested:

● Organizations and people:
• the right people with the right skills
• recovery team members’ knowledge and experience
• staff are aware of service continuity plans
● Information and technology:
• required equipment works
• required data is available
● Partners and suppliers:
• readiness of third parties involved in response and recovery
to meet service continuity requirements
● Processes and value streams:
• procedures are correct, consistent, and manageable

3.2.2 Business impact analysis


This process includes the activities listed in Table 3.3 and transforms the inputs into outputs.

Table 3.3 Inputs, activities, and outputs of the business impact analysis process

Key inputs:
● Service documentation
● Risk assessment reports
● Financial data of loss of VBFs
● Major incident reports
● Service models
● Risk management policy
● Risk appetite
● Regulatory requirements

Activities:
● VBF identification
● Analysis of the consequences of disruption
● VBF interdependencies identification
● Determination of the service continuity requirements

Key outputs:
● Priority list of VBFs
● Documented impacts from a loss of VBFs
● Documented VBF interdependencies
● Business impact analysis report

Figure 3.3 shows a workflow diagram of the process.

Figure 3.3 Workflow of the business impact analysis process

These activities may be carried out with varying levels of formality by many people in the
organization. Table 3.4 outlines these activities further.

Table 3.4 Activities of the business impact analysis process

Activity Description

VBF identification VBF refers to the part of a service that is critical to the success of the
service provider and/or customers. It is important that the VBFs are
recognized and documented to provide the appropriate focus and
resource allocation.

Many different techniques can be used to identify risks, including
brainstorming, interviews with stakeholders (including customers and
users), analysis of the service documentation, and so on.

If the service provider has an established risk management practice,
information about risk assessments might be useful for understanding the
most critical areas.

Analysis of the consequences of disruption
When VBFs are identified, the impacts of disruption should be
determined. This impact could be a ‘hard’ impact that can be precisely
identified, such as financial loss, or a ‘soft’ impact, such as a tarnished
reputation or loss of competitive advantage.

The following forms of loss proposed by FAIR 11 might be considered:

● Productivity: the reduction in a service provider’s ability to deliver
services
● Response: expenses associated with managing a loss event
● Replacement: the intrinsic value of an asset, the expense associated
with replacing lost or damaged assets (for example, purchasing a
replacement server)
● SLA fines and regulatory judgments: legal or regulatory actions
levied against the service provider
● Competitive advantage: losses associated with diminished
competitive advantage
● Reputation: losses associated with an external perception of the
service provider

Impacts may change over time. A service provider and customers may be
able to function without a particular service or VBF for a short period of
time, but over time the impacts may increase until the service provider
or customers can no longer operate.

One of the key outputs from a BIA exercise is a graph of the anticipated
losses of an IT service or specific VBF over time. This graph is then used
to drive the service continuity strategies and plans.

Losses due to service outages more commonly grow exponentially over
time. Along with losses related to the reduction in an organization’s
ability to generate its primary value proposition, there are also threats
of fines, judgements, and reputational damage.

11 An Introduction to Factor Analysis of Information Risk (FAIR)

VBF interdependencies identification
The interdependencies between VBFs and service components and key
internal and external resources should be identified and documented.

To do this, the service provider may use service and configuration
models if a configuration management database is in place. Component
failure impact analysis (CFIA) may also be a useful technique. CFIA can
be used for identifying single points of failure, existing redundancies,
and so on.

Determination of the service continuity requirements
Based on the analysis of the consequences of disruption and the
identified interdependencies, the service provider should determine
service continuity requirements for each service or VBF within the scope
of service continuity management, including:

● recovery time objective(s)
● recovery point objective(s)
● minimum target service level(s).
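
The loss-over-time graph described in Table 3.4 can be approximated numerically to support the
choice of a recovery time objective. The Python sketch below is a simplified illustration: it
assumes that cumulative losses follow a simple exponential curve, and the function names, curve
parameters, and tolerable-loss figure are all hypothetical. A real BIA would fit the curve to the
organization's own impact data.

```python
import math


def cumulative_loss(outage_hours: float, scale: float, growth_rate: float) -> float:
    """Anticipated cumulative loss after an outage, assuming exponential growth."""
    return scale * (math.exp(growth_rate * outage_hours) - 1.0)


def longest_tolerable_outage(tolerable_loss: float, scale: float,
                             growth_rate: float) -> float:
    """Invert the loss curve: the outage duration at which losses reach the limit."""
    return math.log(tolerable_loss / scale + 1.0) / growth_rate


# Hypothetical curve parameters and loss limit (assumptions for illustration).
SCALE = 1_000.0            # currency units
GROWTH_RATE = 0.25         # per hour
TOLERABLE_LOSS = 50_000.0  # maximum loss the organization is willing to accept

# Tabulate the loss curve, as a BIA graph would show it.
for hours in (1, 4, 8, 16, 24):
    loss = cumulative_loss(hours, SCALE, GROWTH_RATE)
    print(f"{hours:>2} h outage -> anticipated loss {loss:>12,.0f}")

rto_upper_bound = longest_tolerable_outage(TOLERABLE_LOSS, SCALE, GROWTH_RATE)
print(f"Losses reach the tolerable limit after about {rto_upper_bound:.1f} hours")
```

Under these assumptions, the duration at which losses reach the tolerable limit gives an upper
bound for the RTO. The objective that is finally agreed would normally be set below that bound
and balanced against the cost of the measures needed to achieve it.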

3.2.3 Developing and maintaining service continuity plans


This process includes the activities listed in Table 3.5 and transforms the inputs into outputs.

Table 3.5 Inputs, activities, and outputs of the developing and maintaining service continuity
plans process

Key inputs:
● Business impact analysis report(s)
● Existing controls
● Information about available resources
● Consumer’s continuity plans
● Service continuity policy

Activities:
● Service continuity strategy development
● Service continuity plans development
● Initial testing of service continuity plans

Key outputs:
● New and updated controls
● Service continuity strategies
● Service continuity plans

Figure 3.4 shows a workflow diagram of the process.

Figure 3.4 Workflow of the developing and maintaining service continuity plans process

These activities may be carried out with varying levels of formality by many people in the
organization. Table 3.6 outlines these activities further.

Table 3.6 Activities of the developing and maintaining service continuity plans process

Activity Description

Service continuity strategy development
Based on the BIA report(s), the service provider should determine an
appropriate and cost-effective set of service continuity strategies.

For processes and services with earlier and higher impacts, more
preventive measures should be adopted. For processes and services where
the impact is lower and takes longer to develop, greater emphasis should
be placed on recovery measures.

Service continuity plans development
Based on the service continuity policy and strategies, the service provider
should develop and maintain service continuity plans.

Where services or recovery team members have changed, it is essential
that plans are updated. Plans may also be updated following an exercise or
an actual recovery.

Initial testing of service continuity plans
Before publishing, service continuity plans should be tested. The methods
of initial testing are similar to those of ongoing exercising.

3.2.4 Testing service continuity plans


This process includes the activities listed in Table 3.7 and transforms the inputs into outputs.

Table 3.7 Inputs, activities, and outputs of the testing service continuity plans process

Key inputs:
● Awareness and exercise programme
● Service continuity plans

Activities:
● Performing exercises
● Service continuity audit

Key outputs:
● Exercise report(s)
● Requirements for new and updated controls
● Request for change of policy or plans
● Audit reports

Figure 3.5 shows a workflow diagram of the process.

Figure 3.5 Workflow of the testing service continuity plans process

These activities may be carried out with varying levels of formality by many people in the
organization. Table 3.8 outlines these activities further.

Table 3.8 Activities of the testing service continuity plans process

Activity Description

Performing exercises
Exercises should be conducted at planned intervals and when significant
changes may impact recovery. The higher the possible impact of a service
outage is, the higher the frequency of exercising should be.

Exercising and testing are not only ways of ensuring readiness; they are also
improvement opportunities. It is generally a good idea to analyse the results of
testing and the overall recovery team performance, then produce exercise
reports that include outcomes and recommendations.

Exercise reports might include requirements for new or updated
controls, or requests for changes to service continuity plans.

If an exercise fails, the schedule of subsequent exercises is updated so that
the failed exercise is re-performed as soon as possible.

Service continuity audit
Service continuity audits ensure that the BIA, service continuity strategies, and plans
remain appropriate and relevant as the environment changes. Audits are usually
carried out on a scheduled basis, but may be triggered by a failed exercise or
a failed recovery.

Audits may be carried out internally or by third parties. The output of the audit
may identify a need to implement new or updated controls or to adjust the service
continuity policy or plans.

3.2.5 Response and recovery


This process includes the activities described in Table 3.9 and transforms the inputs into outputs.

Table 3.9 Inputs, activities, and outputs of the response and recovery process

Key inputs:
● Service continuity plans
● Incident record(s)

Activities:
● Invocation
● Executing service continuity plans

Key outputs:
● Recovery report(s)
● Requirements for new and updated controls
● Request for change of plans

Figure 3.6 shows a workflow diagram of the process.

Figure 3.6 Workflow of the response and recovery process

These activities may be carried out with varying levels of formality by many people in the
organization. Table 3.10 outlines these activities further.

Table 3.10 Activities of the response and recovery process

Activity Description

Invocation
Invocation is the act of declaring that an organization’s continuity
arrangements need to be put into effect in order to continue delivering
key products and services 12.

The decision on invocation is typically made by a ‘crisis management’
team (within the strategic level of the organization’s structure 13),
accounting for the:

● Potential impact of the service outage
● Likely duration of the service outage
● Time of day/month/year

12 ISO 22301:2012
13 See 4.2 for details

Crisis management teams may decide not to invoke service continuity
plans if the risks are low.

In cases of invocation, crisis management teams should also:

● Decide which recovery options the service provider is going to use (if
several options are available)
● Define the scope of the invocation (services, products, sites, locations,
and so on)

Invocation is the ultimate test of service continuity plans. If the
preparatory work has been completed and plans have been developed and
tested, then invocation should be straightforward. If the plans have not
been tested, failures can be expected.

Executing service continuity plans
Once invocation happens, all of the involved recovery teams should
perform service continuity procedures. Recovery is likely to be a time of
high activity, involving long hours for many individuals. This must be
recognized and managed by the recovery team coordinators on a tactical
level.

A disruption could occur at any time, so it is essential that guidance on
the invocation process is readily available to key staff in and away from
the office.

The recovery process generally includes the following stages:

● Response: responding to a disruptive event in order to prevent damage,
such as in cases of fire or cyber-attack.
● Recovery: resuming service delivery according to the RTO, RPO, and minimum
target service levels.
● Returning to normal operations.

4 Organizations and people


4.1 ROLES, COMPETENCIES, AND RESPONSIBILITIES
The ITIL practice guides do not describe the practice management roles such as practice owner,
practice lead, or practice coach. They focus instead on the specialist roles that are specific to
each practice. The structure and naming of each role may differ from organization to organization,
so any roles defined in ITIL should not be treated as mandatory, or even recommended.
Remember, roles are not job titles. One person can take on multiple roles and one role can be
assigned to multiple people.

Roles are described in the context of processes and activities. Each role is characterized with a
competency profile based on the model shown in Table 4.1.

Table 4.1 Competency codes and profiles

Competency code   Competency profile (activities and skills)

L   Leader: Decision-making, delegating, overseeing other activities, providing
incentives and motivation, and evaluating outcomes

A   Administrator: Assigning and prioritizing tasks, record-keeping, ongoing
reporting, and initiating basic improvements

C   Coordinator/communicator: Coordinating multiple parties, maintaining
communication between stakeholders, and running awareness campaigns

M   Methods and techniques expert: Designing and implementing work
techniques, documenting procedures, consulting on processes, work analysis,
and continual improvement

T   Technical expert: Providing technical (IT) expertise and conducting
expertise-based assignments

Examples of the roles involved in the service continuity management practice are listed in Table
4.2, together with the associated competency profiles and specific skills.

Table 4.2 Examples of roles with responsibility for service continuity management activities

Governance of service continuity management process

Process activity: Scope definition
Responsible roles: Steering committee
Competency profile: MC
Specific skills:
• Visibility across PESTLE factors influencing the organization

Process activity: Policy setting
Responsible roles: Steering committee
Competency profile: MCL
Specific skills:
• Awareness of organization-specific documentation requirements
• Ensuring ongoing engagement of managers to ensure clarity and the ongoing realization of
service continuity policies

Process activity: Awareness and exercise programme development
Responsible roles: Continuity administrator
Competency profile: ACM
Specific skills:
• Knowledge of exercise types and recovery teams’ structures
• Enabling communication channels

Business impact analysis process

Process activity: VBF identification
Responsible roles: Service or product owner; Relationship manager; Service designer; Customer
Competency profile: CM
Specific skills:
• Business analysis
• Good knowledge of the service consumer’s business
• Good knowledge of products, including their architecture and configuration

Process activity: Analysis of the consequences of disruption
Responsible roles: Service or product owner; Relationship manager; Customer
Competency profile: MC
Specific skills:
• Good knowledge of the service consumer’s business
• Ability to systematically apply qualitative and quantitative risk analysis tools
• Professional competencies and visibility over PESTLE factors that influence the service

Process activity: VBF interdependencies identification
Responsible roles: Service or product owner; Service designer; Technical expert; Architecture
management expert
Competency profile: MT
Specific skills:
• Good knowledge of products, including their architecture and configuration

Process activity: Determination of the service continuity requirements
Responsible roles: Service or product owner; Continuity administrator
Competency profile: MTC
Specific skills:
• Good understanding of recovery process
• Understanding of service continuity policy

Developing and maintaining service continuity plans process

Process activity: Service continuity strategies development
Responsible roles: Continuity administrator; Service designer; Technical expert
Competency profile: TM
Specific skills:
• Good understanding of service continuity options
• Awareness of existing controls
• Awareness of technology available on the market

Process activity: Service continuity plans development
Responsible roles: Continuity administrator; Technical expert
Competency profile: MTA
Specific skills:
• Excellent documentation skills
• Excellent logical skills
• Good understanding of interdependencies of service components
• Good understanding of technology

Process activity: Initial testing of service continuity plans
Responsible roles: Continuity administrator; Response and recovery coordinators and team members
Competency profile: CATL
Specific skills:
• Coordination and communication
• Excellent knowledge of service continuity plans
• Understanding of technology used as part of service continuity strategy

Testing service continuity plans process

Process activity: Performing exercises
Responsible roles: Continuity administrator; Response and recovery coordinators and team members
Competency profile: CATL
Specific skills:
• Coordination and communication
• Excellent knowledge of service continuity plans
• Understanding of technology used as part of service continuity strategy

Process activity: Service continuity audit
Responsible roles: Internal or external auditors (as mandated and on behalf of the board of
directors)
Competency profile: CAMT
Specific skills:
• Audit management techniques
• Command of common audit practices
• Assured auditor integrity, objectivity, and independence

Response and recovery process

Process activity: Invocation
Responsible roles: Crisis management group
Competency profile: LC
Specific skills:
• Excellent understanding of service provider’s and consumers’ risks
• Understanding of consumer context
• Coordination and communication

Process activity: Executing service continuity plans
Responsible roles: Crisis management group; Continuity administrator; Response and recovery
coordinators and team members
Competency profile: CATL
Specific skills:
• Coordination and communication
• Excellent knowledge of service continuity plans
• Understanding of technology used as part of service continuity strategy

4.2 ORGANIZATIONAL STRUCTURES AND TEAMS


Disasters are high-impact events, so responses must be very quick; the coordination of response
and recovery activities requires flexibility. Therefore, the business-as-usual structure is not
relevant for disasters.

During the recovery process, the organizational structure is generally based around the levels of
continuity plans. The levels of organizational structure for response and recovery are outlined in
Table 4.3.
Table 4.3 Organizational structure for response and recovery

Level of continuity plans: Strategic
Organizational level: Executive level
Description: This includes senior management/executives, who have overall
authority and control within the organization and who are
responsible for crisis management and liaising with other
departments, divisions, organizations, the media, regulators,
emergency services, and so on.

Level of continuity plans: Tactical
Organizational level: Coordination level
Description: Typically one level below the executive group, this group is
responsible for coordinating the overall recovery effort within
the organization.

Level of continuity plans: Operational
Organizational level: Specialist level
Description: A series of service recovery teams that are responsible for
executing plans within their own areas and for liaising with
staff, customers, and third parties. Within IT, recovery teams
should be grouped by services and products.

5 Information and technology


5.1 INFORMATION EXCHANGE, INPUTS/OUTPUTS
The effectiveness of the service continuity management practice is based on the quality of the
information used. This information can include:

● consumer’s business processes
● services and their architecture and design
● partners and suppliers and information on the services they provide
● regulatory requirements regarding service continuity
● technology and services available on the market that may be relevant for service continuity
arrangements.
The key inputs and outputs of the practice are listed in section 3.

Service continuity plans are the core of the practice. They should be up to date and available for
all involved parties.

5.2 AUTOMATION AND TOOLING


Especially in large-scale organizations, the service continuity practice should be automated. Where
this is possible and effective, it may involve the solutions outlined in Table 5.1.

Table 5.1 Automation solutions for service continuity management activities

Governance of service continuity management process

Process activity: Scope definition; Policy setting
Means of automation: Knowledge management tools and document repositories
Key functionality: Service continuity policies, including the scope of the programme,
guidelines, and roles and responsibilities, need to be easily accessible by the service provider
staff, regulators, and external stakeholders, such as customer representatives
Impact on the effectiveness of the practice: Low

Process activity: Awareness and exercise programme development
Means of automation: Business continuity planning tools
Key functionality: Service continuity administrators, service owners, and recovery team
members should have access to the exercise schedule and information about the scope of the
exercise in which they are involved
Impact on the effectiveness of the practice: Medium

Business impact analysis process

Process activity: VBF identification
Means of automation: Service catalogue, CMDB, BPM tools
Key functionality: To identify VBFs, the service analyst should have access to information
about the service components and actions. BPM tools may provide information about the
consumer’s processes and operations supported by the service
Impact on the effectiveness of the practice: High

Process activity: Analysis of the consequences of disruption
Means of automation: Business continuity planning tools, analytical tools, risk assessment
tools, incident management tools
Key functionality: Analysis can be underpinned by a variety of management systems data, such
as incident reports and information about realized risks. Analysts may also use modelling tools
to forecast expected losses in case of service or specific VBF outages.
Impact on the effectiveness of the practice: High

Process activity: VBF interdependencies identification
Means of automation: Business continuity planning tools, CMDB, analytical tools
Key functionality: Analysts may use service and configuration models to identify key service
and VBF interdependencies.
Impact on the effectiveness of the practice: High

Determination of Business continuity Continuity administrator, Low


the service planning tools, service service owners and recovery
continuity catalogue team members should have
requirements access to service continuity
requirements.

Developing and maintaining service continuity plans process

Service continuity Business continuity Determining existing controls Medium


strategies planning tools, CMDB, and resilience measures

AXELOS Copyright
View Only – Not for Redistribution
© 2020
40 Service continuity management AXELOS Copyright
View Only – Not for Redistribution ©
2020

Process activity Means of automation Key functionality Impact on the


effectiveness of
the practice

development change initiation and


control tools
Initiating changes that should
be implemented as a part of
service continuity strategy
realization

Service continuity Business continuity Control of expiry dates, Low to high,


plans development planning tools, version control, and archiving depending on the
document control tools of documents volume of
documents to
manage

Initial testing of
service continuity See ‘performing exercises’
plans

Testing service continuity plans process

Performing Conferencing tools, All involved parties should be High


exercises monitoring tools, able to communicate and
technology collaborate, have ongoing
management, and understanding of current
system administration situation and manage service
tools components in order to
execute service continuity
plans.

Service continuity Knowledge The auditors should have Medium


audit management tools and access to the service
document repositories continuity documentation,
including plans, exercise
programmes, exercise
reports, and recovery
reports.

Response and recovery process

Invocation Monitoring tools, Crisis management group High


must be able to get
emergency

AXELOS Copyright
View Only – Not for Redistribution
© 2020
AXELOS Copyright Service continuity management 41
View Only – Not for Redistribution
© 2020

Process activity Means of automation Key functionality Impact on the


effectiveness of
the practice

notification, information about event and


instantly direct response and
conferencing tools,
recovery process.
incident management
tools

Executing service Conferencing tools, All involved parties should be High


continuity plans able to communicate and
emergency
collaborate, have an ongoing
management tools,
understanding of the current
monitoring tools, situation, and manage service
components in order to
technology
execute service continuity
management and
plans
system administration
tools,

incident management
tools


6 Partners and suppliers


Very few services are delivered using only an organization’s own resources. Most, if not all, depend
on other services, often provided by third parties outside the organization (see section 2.4 of ITIL®
Foundation: ITIL 4 Edition for a model of a service relationship). Relationships and dependencies
introduced by supporting services are described in the practice guides for service design,
architecture management, and supplier management.

Partners and suppliers may provide critical products and service components. The service provider
needs to negotiate and agree service continuity requirements with these partners and suppliers so
that the continuity requirements of its own services can be met.

Partners and suppliers may also provide continuity services and solutions, such as backup sites,
on-demand computing, and disaster recovery as a service. In these cases, they should also be
involved in service continuity plan development, testing, and execution.


7 Important reminder
Most of the content of the practice guides should be taken as a suggestion of areas that an
organization might consider when establishing and nurturing their own practices. The practice
guides are catalogues of topics that organizations might think about, not a list of answers. When
using the content of the practice guides, organizations should always follow the ITIL guiding
principles:

● focus on value
● start where you are
● progress iteratively with feedback
● collaborate and promote visibility
● think and work holistically
● keep it simple and practical
● optimize and automate.
More information on the guiding principles and their application can be found in section 4.3 of
ITIL® Foundation: ITIL 4 Edition.


8 Acknowledgments
AXELOS Ltd is grateful to everyone who has contributed to the development of this guidance.
These practice guides incorporate an unprecedented level of enthusiasm and feedback from across
the ITIL community. In particular, AXELOS would like to thank the following people.

8.1 AUTHORS
Pavel Demin

8.2 REVIEWERS
Dinara Adyrbai, Roman Jouravlev
