Smart Network and Security Operations Centre


SMART NETWORK AND SECURITY OPERATIONS CENTRE


TAN Shyh Hae, LEE Kok Thong, SEOW Nyi Matthew, TAN Choon How

ABSTRACT
This article shares the rationale and benefits of combining the conventional Network Operations Centre (NOC) and Security
Operations Centre (SOC) into an integrated Network and Security Operations Centre (NSOC). By re-engineering operational
processes and augmenting them with technologies such as end-to-end IT visualisation and analytics, NSOC provides IT
managers and operators with end-to-end situational awareness and a streamlined incident management process.

With NSOC, IT incidents can now be managed more holistically and efficiently and this helps in reducing service recovery
lead time and minimising additional head count while increasing the operational availability of IT systems.

Keywords: NOC, SOC, NSOC, Network and Security Operations Centre, Converge

INTRODUCTION

The Singapore Armed Forces (SAF) operations of today are becoming more complex due to increasing network-centric operations, operations other than war and cyber threats. There is a need to enhance the monitoring of IT systems performed by the SAF’s existing Network Operations Centre (NOC) and Security Operations Centre (SOC) as well as streamline the incident management process so that IT incidents can be quickly detected and efficiently managed. This would enable services to be restored promptly and increase IT system resiliency.

BACKGROUND OF NOC AND SOC

NOC and SOC are synonymous with the smooth and secure running of today’s IT landscape. They are critical IT nerve centres of public and private enterprises throughout the world. Historically, NOCs and SOCs functioned as separate entities fulfilling different missions.

NOCs play a pivotal role in infrastructure availability and are often measured by uptime Service Level Agreements (SLA). NOC operators utilise a range of management tools to actively monitor and manage the performance and status of various IT infrastructure equipment such as routers and switches, with increasing expansion of scope to include servers, storage and data centre facilities. Through the use of these tools, NOC operators are able to perform fault management and coordination of service recovery efforts.

On the other hand, the emergence of increasingly advanced cyber threats has created a new dimension of challenge that goes beyond the capabilities of NOC management tools. SOCs were hence established with specialised tools to provide capabilities such as security information and event management as well as malware analysis that enable cybersecurity analysts to focus on the deep investigative and forensic work required to accurately detect and respond to cybersecurity incidents.

Working in tandem, NOCs and SOCs ensure the availability and integrity of IT systems, functioning similarly to a human’s central nervous and immune systems that detect and respond to infections.

SYNERGIES BETWEEN NOC AND SOC

As IT infrastructures grow in size and complexity to meet users’ increasing operational needs, NOCs and SOCs will need to work closely together to provide a holistic infrastructure and security view of the IT system. This will enable better sensemaking and situational awareness which will allow the NOC and SOC to

24 DSTA HORIZONS | 2016


remain effective in addressing monitoring and service recovery challenges amid infrastructure growth and complexity.

Fusion of Situational Views across NOC and SOC

A NOC will need to be enhanced with manager-of-managers¹ (MoM) capabilities to fuse together information from the various NOC and SOC tools and provide a holistic infrastructure and security view. This view will provide the operator with timely situational awareness on the interdependencies between various infrastructure and security equipment and facilitate more accurate impact analysis and service recovery prioritisation. For example, the MoM can automatically pinpoint the potential root cause of an infrastructure outage to an open door that caused the room temperature to rise and equipment to overheat, instead of the traditional approach of overwhelming the operator with separate door sensor, temperature and device failure alarms. In a heavily virtualised cloud environment, the MoM can also automatically determine the potential root cause of slow application processing instead of the traditional display of numerous independent performance counters for the operator to self-correlate.

Addressing Overlapping Infrastructure Faults and Cyber Threats

The line between infrastructure faults and cyber threats is becoming increasingly blurred as more powerful and deceptive cyber attacks tend to autonomously jump between different infrastructure equipment in order to cover their tracks and launch their attacks. For example, the infamous Stuxnet computer worm entered a closed network via a USB infection and exploited the Siemens Step-7 programmable logic controller software application to cause the Iranian centrifuges to overspin and become damaged and unserviceable. To be able to more effectively detect such threats, one possibility is to have NOC and SOC collaborate, cross-correlate and potentially identify the common patterns from their respective tools instead of the traditional approach of looking at infrastructure faults and security events in silos. These anomalous patterns can then be further investigated by specialists to diagnose and pinpoint the nature of the infrastructure incidents more accurately.

Enriching Security Insights

Information gathered by the NOC will be able to enrich the SOC investigative and forensics work. With an end-to-end monitoring system cross-referencing infrastructure faults and behaviour anomalies with cyber incidents as well as trending and insights from analytic tools, operators will better understand the extent of the cyber threat being analysed. They can also determine the indicators or signatures to look out for and easily correlate seemingly unrelated events. This can help to provide greater insights into low signature security events which may normally be ignored by cybersecurity analysts focusing on higher volume and higher signature events.

CONVERGENCE OF NETWORK AND SECURITY OPERATIONS CENTRES

NOCs and SOCs generally have similar operational structures with both using tiered monitoring and incident response teams. Junior operators usually form Tier 1 and are responsible for work orders, system monitoring, call handling, preliminary investigation and triage of detected and reported events. Events that cannot be triaged are escalated to senior Tier 2 specialists for more detailed review and resolution. Tier 3 subject matter experts serve as the final escalation point for the most complex of issues.

In addition, there are commonalities in NOC and SOC infrastructures and operations. NOCs and SOCs both require analyst workstations, call routing and management systems and facilities, service level agreements, standard operating procedures, workflow and trouble ticketing.

To enable NOCs and SOCs to work closely together for better sensemaking and situational awareness as well as remain effective in addressing next-generation infrastructure monitoring and service recovery challenges, an innovative approach is to combine the conventionally separate and independent NOC and SOC into a common Network and Security Operations Centre (NSOC). This is achieved by consolidating operations and re-engineering processes. This integrated approach is also currently being explored by companies such as American Systems, General Dynamics, HP and IBM.

Moreover, this approach helps to save on valuable data centre real estate and corresponding power and cooling facilities as NOC and SOC components require significant resources to run. It also helps to minimise the monitoring load placed on the infrastructure equipment as one common information aggregator can collect all the data required and then share it with NOC and SOC tools instead of each operations centre collecting data separately. In addition, a common NSOC will have the integrated processes and structures in place to allow NOC and SOC operators to communicate and
coordinate seamlessly as well as tap each other’s skillsets and experiences to identify, manage and resolve incidents effectively.

Technology as a Key Enabler for NSOC

Traditional Network Management Systems (NMS) have difficulties performing the role of a MoM as the majority of them do not have out-of-the-box equipment adapters to correctly interpret information from the different brands and models of infrastructure equipment in use, along with their corresponding management tools. Those that have the equipment adapters are handicapped by the need to manually create rules to map out the interdependencies between components, which creates sustainability and scalability issues.

However, the reference architecture behind NSOC is now achievable with the advancement of enterprise network management technologies.

Standardisation and Compatibility

With the maturing of network system management technology, much infrastructure equipment today leverages common standards such as SNMP², SYSLOG³, REST⁴, JSON⁵ and XML⁶ to communicate with the management tools. This standardisation enables the enterprise NMS to easily communicate with the disparate infrastructure and security management tools to understand the information being presented. For legacy systems that are still using proprietary communication methods, enterprise NMS now comes with a number of predefined equipment adapters and this makes it easy to reach out to these legacy systems without needing to self-customise.

Data, Cybersecurity and Infrastructure Analytics

Traditionally, a major challenge in enabling end-to-end situational awareness is the inability to map out the relationship between various infrastructure equipment and their performance statistics and trends. The typical approach is to manually define relationship rules to link the various equipment together as well as manually inspect and correlate statistics. This approach is laborious, prone to human error and unsustainable.

Analytics capabilities found in today’s enterprise NMS are able to form the overall infrastructure topology and dependencies automatically from information obtained from the various infrastructure equipment to provide the operator with a real-time automated end-to-end holistic view of infrastructure performance statistics and trends.

For example, in a virtualised cloud environment, the enterprise NMS is able to automatically show on which physical host a virtual machine is running, as well as the underlying network and storage connections, without needing manual human intervention. This topology awareness significantly reduces the need for manual rule creation and maintenance while providing the operator with a real-time and up-to-date, end-to-end topology view that facilitates situational awareness. This capability enables the operator to pinpoint the various bottlenecks in the infrastructure quickly and make the necessary adjustments, potentially before actual degradation occurs. To investigate the cause of slow application processing, the operator will no longer need to manually look at the current and historical performance statistics for all the supporting infrastructure equipment and attempt to correlate and identify a pattern. This holistic topology view also enables automated end-to-end root cause analysis that speeds up the identification of the actual cause of a fault.

At the same time, by analysing the network inventory and configuration data, the NSOC will be able to automatically alert the operator on potential security vulnerabilities in the infrastructure and locate components that are not compliant with organisational security profiles.

Data and Software Integration

These new capabilities form the bedrock of an NSOC’s operation by fusing together the various NOC and SOC tools and providing the NSOC operator with a holistic end-to-end view of the interdependencies between the various infrastructure equipment as well as security incidents. This timely situational awareness facilitates faster and more accurate service impact analysis. This is important in assessing the actual health and performance of the application and corresponding service recovery prioritisation.

The integration of technology also maximises cost effectiveness in building and maintaining the underlying management infrastructure as well as paving the way for refining incident and problem management processes.

Processes

The establishment of an integrated NSOC facilitates the ease of information sharing and enables close collaboration between the previously separate NOC and SOC teams.




Streamlining and Automation

Disparate processes can now be streamlined and better automated. For example, instead of manning two separate incident response hotlines with two different teams performing their own work, one single hotline that handles both NOC and SOC incidents can be created as illustrated in Figure 1. The hotline operator can perform first-level diagnosis using the integrated NSOC tools to identify if there is an infrastructure fault or cyber incident and route the incident to the respective service recovery teams. If the cause of the incident is not straightforward, it can be escalated to second-level NOC and SOC specialists to perform more in-depth investigation. Commonly occurring incidents can also be automatically identified and routed to the service recovery teams without the need for operator involvement.

Centralising Case Management

All the incidents are tracked via a common case management system that automatically monitors the progress status and flags cases for escalation if service recovery will breach the established service level agreement. The case management system also reconciles the incident resolutions into a common knowledge base that operators can refer to when incidents of a similar nature occur, hence further improving triage accuracy and reducing service recovery lead time.

Design for Support

The integration, streamlining and automation enabled by NSOC make it easier for operators to perform their jobs and focus on incident management and service recovery tasks as less system training and maintenance is required.

People

The availability and sustainability of suitably trained operators is an increasing concern in today’s manpower landscape as the skilled engineering pool is decreasing over the years. This is an issue that needs to be systematically addressed.

Beyond the technological enhancements, process streamlining and automation in NSOC, opportunities are created to optimise manpower headcount and at the same time make operators feel more engaged with higher value tasks.

For example, NOC operators are experienced in server, desktop and network support and will have good troubleshooting skills and TCP/IP protocol suite knowledge. The same set of skills is also necessary for SOC tasks. Hence, instead of engaging two persons to perform overlapping tasks, better synergy can be achieved by cross-training staff so that one person can perform first-level tasks for both the NOC and SOC. This gives a more holistic meaning to the staff’s job while at the same time allowing for the creation of a leaner team.

Figure 1. Streamlining of incident management processes
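
The single-hotline flow in Figure 1, together with the SLA-based escalation described under Centralising Case Management, can be pictured with a short Python sketch. It is an illustration only and not the NSOC implementation: the `Incident` fields, the `KNOWN_PATTERNS` table, the team names and the way the recovery time is estimated are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical table of commonly occurring incidents that can be routed
# automatically, without operator involvement.
KNOWN_PATTERNS = {
    "link_down": "NOC recovery team",
    "malware_signature": "SOC recovery team",
}


@dataclass
class Incident:
    symptom: str                      # label reported to the hotline or by a tool
    opened_at: datetime
    sla: timedelta                    # agreed service recovery time
    category: Optional[str] = None    # "infrastructure" or "cyber", once diagnosed


def route(incident: Incident) -> str:
    """Return the team that should handle the incident next."""
    # 1. Commonly occurring incidents are routed automatically.
    if incident.symptom in KNOWN_PATTERNS:
        return KNOWN_PATTERNS[incident.symptom]
    # 2. Result of the hotline operator's first-level diagnosis.
    if incident.category == "infrastructure":
        return "NOC recovery team"
    if incident.category == "cyber":
        return "SOC recovery team"
    # 3. Cause not straightforward: escalate to second-level specialists.
    return "Tier 2 NOC and SOC specialists"


def will_breach_sla(incident: Incident, now: datetime,
                    estimated_fix: timedelta) -> bool:
    """Flag a case when the projected recovery time passes the SLA deadline."""
    return now + estimated_fix > incident.opened_at + incident.sla
```

In this sketch the routing table plays the role of the common knowledge base: each resolved incident that recurs often enough could be added to it, so that later occurrences bypass manual triage.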



REFERENCE AND SYSTEM ARCHITECTURE FOR NSOC

The reference and system architectures for NSOC consist of three layers (see Figures 2 and 3).

Data Source Layer

This comprises the infrastructure equipment in use. Raw instrumentation data such as performance counters, health status and logs are used by the respective NMS in the System Management Layer for individual monitoring and management purposes. These data are also piped into the Service Management Layer via the Logs and Events Consolidator for further processing.

System Management Layer

NMS performing the FCAPS⁷ monitoring and management for the IT infrastructure are grouped under this layer. Alarms, events and statistics from these systems are fed into the Service Management Layer via the Logs and Events Consolidator for further processing.

Service Management Layer

The capabilities in this layer form the ‘brain’ of the NSOC. It provides the holistic situational picture and decision support functions for NSOC managers and operators to assess the operational impact of the IT infrastructure incident and perform the required recovery actions. The Logs and Events Consolidator aggregates and indexes information from the System Management and Data Source Layers into a centralised data warehouse for the various Service Management Layer tools to perform searching, analysing and reporting tasks.

Performing Service Impact Analysis

The Service Management Layer reconciles related historical and current events from various infrastructure equipment to identify and advise the NSOC operator on the potential root causes of the IT infrastructure incident, the equipment involved and the sequence of events and activities leading to the incident. It also assesses the actual impact of the incident on the IT infrastructure availability by factoring in infrastructure redundancy and criticality parameters. Furthermore, it provides the NSOC operator with recommendations on the corresponding service recovery prioritisation.
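
The impact assessment described above can be pictured as a small calculation over redundancy groups. The following Python sketch is a minimal illustration under stated assumptions: the `Service` data model, the representation of redundancy as sets of interchangeable component names, the integer `criticality` scale and the three impact states are all hypothetical and are not taken from the NSOC design.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Service:
    name: str
    criticality: int                  # higher = more important (assumed scale)
    # Each group lists interchangeable components; the service needs
    # at least one healthy member from every group.
    redundancy_groups: List[Set[str]]


def impact(service: Service, failed: Set[str]) -> str:
    """Classify the effect of the failed components on one service."""
    degraded = False
    for group in service.redundancy_groups:
        healthy = group - failed
        if not healthy:
            return "down"             # a whole group is lost
        if healthy != group:
            degraded = True           # redundancy reduced but still serving
    return "degraded" if degraded else "ok"


def recovery_priority(services: List[Service],
                      failed: Set[str]) -> List[Tuple[str, str]]:
    """Order affected services: outages first, then by descending criticality."""
    severity = {"down": 0, "degraded": 1}
    assessed = [(s, impact(s, failed)) for s in services]
    affected = [(s, state) for s, state in assessed if state != "ok"]
    affected.sort(key=lambda pair: (severity[pair[1]], -pair[0].criticality))
    return [(s.name, state) for s, state in affected]
```

With this model, losing one switch of a redundant pair only degrades the services behind it, whereas losing a component that has no standby takes the dependent service down and moves it to the top of the recovery list.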

Figure 2. NSOC reference architecture
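
As a rough illustration of how the Logs and Events Consolidator in the reference architecture might bring data from different sources into one searchable store, the Python sketch below normalises a syslog line and a JSON alarm into a common record. It is a toy model only: the record shape, the JSON field names (`device`, `severity`, `text`) and the `Consolidator` class are assumptions for the example. The only external convention relied on is the syslog priority prefix, where severity is the priority value modulo 8.

```python
import json
import re
from collections import defaultdict
from typing import Dict, List

# Syslog priority prefix, e.g. "<34>..."; severity = PRI % 8.
_PRI = re.compile(r"^<(\d{1,3})>(.*)$")


def from_syslog(source: str, line: str) -> Dict[str, object]:
    """Normalise one syslog line from a device into the common record shape."""
    match = _PRI.match(line)
    if match is None:
        return {"source": source, "severity": None, "message": line}
    pri = int(match.group(1))
    return {"source": source, "severity": pri % 8, "message": match.group(2)}


def from_json(payload: str) -> Dict[str, object]:
    """Normalise a JSON alarm from a management tool (field names are assumed)."""
    data = json.loads(payload)
    return {
        "source": data["device"],
        "severity": data.get("severity"),
        "message": data["text"],
    }


class Consolidator:
    """Toy stand-in for the Logs and Events Consolidator: one indexed store."""

    def __init__(self) -> None:
        self._by_source: Dict[str, List[Dict[str, object]]] = defaultdict(list)

    def ingest(self, record: Dict[str, object]) -> None:
        """Index a normalised record by the equipment that produced it."""
        self._by_source[str(record["source"])].append(record)

    def search(self, source: str) -> List[Dict[str, object]]:
        """Return all records seen for one piece of equipment."""
        return list(self._by_source.get(source, []))
```

Collecting once and indexing by equipment is what lets both NOC and SOC tools query the same data instead of polling the equipment separately.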




Detecting Infrastructure Anomalies and Forecasting Capacity Growth

By performing analytics on historical performance trends, the Service Management Layer flags infrastructure behaviour deviations for further investigation into potential cybersecurity threats or infrastructure faults. This information is further extrapolated to estimate infrastructure growth requirements and prompts the NSOC manager to make necessary adjustments before degradation occurs.

Assessing Infrastructure Vulnerability and Compliance

The Service Management Layer assists the NSOC operators in auditing and ensuring alignment with organisation policies as well as identifying potential security issues and bugs. This is done by actively scanning the infrastructure equipment to determine whether it is exposing known vulnerabilities, and by automatically analysing infrastructure inventory and configuration information to determine whether vulnerable software components are installed.

Facilitating Incident and Problem Management

A case management system is provided within the Service Management Layer to centrally track all IT infrastructure incidents. This system comes with workflow automation, SLA monitoring and knowledge management functions that enable NSOC managers and operators to perform swifter and more informed decision making while reducing human error. At the same time, it identifies regularly recurring incidents and prompts the NSOC operator for further investigation and escalation so that the actual root cause can be identified, hence increasing future system availability.

CHALLENGES

The consolidation of conventionally separate and independent NOC and SOC into a common NSOC enables incidents to be addressed more holistically and efficiently to increase system resiliency and maintain operational effectiveness. However, there are three inherent challenges that need to be addressed in order for NSOC to materialise.

Foremost on the list are ownership and skillset issues. Typically, NOCs and SOCs each have their own system owners. When merged, there is a need to iron out issues such as who will be the owner and final decision maker for the NSOC. For example, a NOC operator may interpret a device outage event as an indicator of equipment failure while a SOC analyst may interpret that same event as an indicator of compromised equipment. At the same time, beyond the fundamental infrastructure and system technical skills, SOC skillsets are investigative in nature while NOC skillsets are more focused on troubleshooting and recovery. The NOC and SOC staff will need to cross-train and adjust their mindsets and mental models. They will also need to expand their range of skills more rapidly and react faster to the increased number of technologies involved.

Figure 3. NSOC system architecture
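
As one illustration of the Service Management Layer functions in Figure 3, the deviation flagging and capacity extrapolation described under Detecting Infrastructure Anomalies and Forecasting Capacity Growth could look like the Python sketch below. The article does not say which analytics are used, so the techniques here are simple stand-ins chosen for the example: a standard-deviation test against history and a least-squares straight-line fit. The function names, the default threshold of 3 and the assumption of equally spaced samples are all hypothetical.

```python
from statistics import mean, stdev
from typing import List, Optional


def is_deviation(history: List[float], latest: float,
                 threshold: float = 3.0) -> bool:
    """Flag a reading more than `threshold` standard deviations from the
    historical mean. `history` needs at least two samples."""
    spread = stdev(history)
    if spread == 0:
        return latest != history[0]
    return abs(latest - mean(history)) / spread > threshold


def periods_until_full(usage: List[float], capacity: float) -> Optional[float]:
    """Fit a straight line to equally spaced usage samples and extrapolate how
    many further periods remain before `capacity` is reached."""
    n = len(usage)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(usage)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, usage)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    if slope <= 0:
        return None                   # no growth trend, nothing to forecast
    intercept = y_bar - slope * x_bar
    # Period index at which the fitted line hits capacity, minus the last index.
    return (capacity - intercept) / slope - (n - 1)
```

A flagged deviation would be handed to an operator for investigation as either a potential fault or a potential cyber threat, while a small `periods_until_full` value would prompt the NSOC manager to plan additional capacity.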



Next would be the need to balance between NOC and SOC operations. The NOC’s objective is to perform service recovery and enhance resiliency rapidly while the SOC’s objective is to investigate and create countermeasures. This tension between a NOC needing faster recovery and SOC activities resulting in slower recovery will make it difficult for the NSOC operators to make a sound judgement on which approach to take in the event of an unknown incident.

Lastly, when the NOC and SOC components are built around each other in an integrated NSOC, the components become intrinsically intertwined. As the NSOC system scales with more components, the compatibility and interoperability dead weight increases and may affect overall system performance as well as the ability to rapidly add new components to address emerging concerns.

CONCLUSION

While differences exist between NOCs and SOCs, the convergence of both centres can be both practical and beneficial, combining the awareness and control of an enterprise’s nervous system (i.e. the NOC) with the defence and response of its immune system (i.e. the SOC).

Efficiency gains can be realised through the introduction of a single and integrated point-of-contact for all IT infrastructure and security events. Service levels and system resiliency can also benefit through improved communication and increased situational awareness. Incident response time is reduced as a unified operations centre owns both the capability and responsibility for enacting mitigating measures.

REFERENCES

Babu Veerappa Srinivas. (2014). Security operations centre (SOC) in a utility organization. Retrieved from http://www.giac.org/paper/gslc/8336/security-operations-centre-soc-utility-organization/138736

Ennis, S. (2009). A phased approach for building a next-generation network operations center: A planning guide. Retrieved from http://www.eirteic.com/wp-content/uploads/2013/11/whitepapers_phased-approach-for-building-a-next-generation-network-operations-center.pdf

EY. (2013). Security operations centers against cybercrime: Top 10 considerations for success. Retrieved from http://www.ey.com/Publication/vwLUAssets/EY_Security_Operations_Centers_against_cybercrime/$FILE/EY-SOC-Oct-2013.pdf

Goodchild, J. (2009, November). Network and security operations convergence. Retrieved from http://www.networkworld.com/article/2237963/compliance/network-and-security-operations-convergence.html

Imbert, C. (2015). Knitting SOCs: Designing and developing the staff of a security operations center. Retrieved from https://www.sans.org/reading-room/whitepapers/incident/knitting-socs-35975

Jenkins, D. (n.d.). Secure your operations through NOC/SOC integration. [PowerPoint slides]. Retrieved from http://uk.idc.com/downloads/events/sec06_jenkins.pdf

JSON. (n.d.). In Wikipedia. Retrieved July 23, 2015, from https://en.wikipedia.org/wiki/JSON

Meierdirk, A. (2012). Best practices for developing and implementing the right monitoring framework: Next-generation network operations center. [PowerPoint slides]. Retrieved from http://www.remotemagazine.com/conferences/wp-content/uploads/2012/09/INOC.pdf

Metzler, J. (2008). The next generation network operations center: How the focus on application delivery is redefining the NOC. Retrieved from http://www.webtorials.com/main/resource/papers/NetQoS/paper13/NextGenerationNOC.pdf

REST. (n.d.). In Wikipedia. Retrieved July 23, 2015, from https://en.wikipedia.org/wiki/Representational_state_transfer

Simple network management protocol. (n.d.). In Wikipedia. Retrieved July 23, 2015, from https://en.wikipedia.org/wiki/Simple_Network_Management_Protocol

Syslog. (n.d.). In Wikipedia. Retrieved July 23, 2015, from https://en.wikipedia.org/wiki/Syslog

Walker, M. (2009). Manager of managers architectures: Providing enterprise situational awareness to the user. [PowerPoint slides]. Retrieved from http://sunset.usc.edu/GSAW/gsaw2009/s5/walker.pdf

XML. (n.d.). In Wikipedia. Retrieved July 23, 2015, from https://en.wikipedia.org/wiki/XML




ENDNOTES

1 The user interface provides a single, integrated view of the system and network, thereby allowing management of existing and distributed network managers from one interface.

2 Simple Network Management Protocol (SNMP) is an Internet-standard protocol for managing devices on IP networks.

3 SYSLOG is a widely used standard for message logging.

4 Representational State Transfer (REST) is the software architectural style of the World Wide Web.

5 JavaScript Object Notation (JSON) is an open standard format that uses human-readable text to transmit data objects consisting of attribute–value pairs.

6 Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format which is both human-readable and machine-readable.

7 FCAPS is the ISO Telecommunications Management Network model and framework for network management. FCAPS is an acronym for the fault, configuration, accounting, performance and security management categories into which the ISO model defines network management tasks.

BIOGRAPHY

TAN Shyh Hae is a Manager (InfoComm Infrastructure) working on the requirements development and architecting of the next-generation Network and Security Operations Centre (NSOC) concept for the Ministry of Defence (MINDEF) and the Singapore Armed Forces (SAF). He also led project teams that pioneered the implementation of mobile messaging and secure removable storage solutions for MINDEF and the SAF. Under the DSTA Undergraduate Scholarship, Shyh Hae graduated with a Bachelor of Engineering (Computer Engineering) degree from the National University of Singapore (NUS) in 2006. He further obtained a Master of Science (Computing and Security) degree from King’s College London, UK, in 2011 under the DSTA Postgraduate Scholarship.

LEE Kok Thong is concurrently Head Command, Control and Communications (C3) and Ops Systems (Networked Systems) as well as Director (C3 and Ops Systems) of the Ops-Tech Group in the Ministry of Home Affairs (MHA). Kok Thong is primarily responsible for providing engineering support to MHA and the Home Team departments. He was previously Head Capability Development (Operations Infrastructure) in the InfoComm Infrastructure Programme Centre, where he was in charge of developing the SAF Integrated Knowledge-based Command and Control IT infrastructure. Kok Thong also served as Head of DSTA’s Defence Technology Office (Europe) to manage defence technology relations with overseas partners. A recipient of the Public Service Commission Scholarship, Kok Thong graduated with a direct Master of Engineering (Electrical Engineering) degree from Ecole Foundation EPF, France, in 1997. He further obtained a Master of Science (Telecommunications) degree with Distinction from King’s College London, UK, in 1997 as part of the European Schools Exchange Programme. Under the DSTA Postgraduate Scholarship, Kok Thong also graduated with a concurrent Master of Science (Defence Technology and Systems) degree from Temasek Defence Systems Institute, NUS, as well as a Master of Science (Information Assurance) degree with Distinction from the Naval Postgraduate School, USA, in 2003.

SEOW Nyi Matthew is Head Transmission and Core Network (InfoComm Infrastructure) who oversees the design, acquisition, implementation and system management of wide-area critical infrastructures for MINDEF and the SAF. He also contributed to the Infocomm Development Authority of Singapore’s Service-wide Technical Architecture between 2007 and 2011. A recipient of the Public Service Commission Scholarship, Matthew graduated with a Bachelor of Engineering (Electrical and Electronic Engineering) degree with Honours from Nanyang Technological University in 1995.

TAN Choon How is a Senior Engineer (InfoComm Infrastructure) who is involved in conceptualising the next-generation NSOC. He was also part of the team that spearheaded the design and implementation of the Network Monitoring and Diagnostics System used by the SAF Network Operations Centre. Choon How graduated with a Bachelor of Technology (Electronics Engineering) degree with Honours from NUS in 2004.
