Smart Network and Security Operations Centre
ABSTRACT
This article shares the rationale and benefits of combining the conventional Network Operations Centre (NOC) and Security
Operations Centre (SOC) into an integrated Network and Security Operations Centre (NSOC). By re-engineering operational
processes and augmenting them with technologies such as end-to-end IT visualisation and analytics, NSOC provides IT
managers and operators with end-to-end situational awareness and a streamlined incident management process.
With NSOC, IT incidents can be managed more holistically and efficiently, which reduces service recovery lead time and
minimises additional headcount while increasing the operational availability of IT systems.
Keywords: NOC, SOC, NSOC, Network and Security Operations Centre, Converge
INTRODUCTION

The Singapore Armed Forces (SAF) operations of today are becoming more complex due to increasing network-centric operations, operations other than war and cyber threats. There is a need to enhance the monitoring of IT systems performed by the SAF’s existing Network Operations Centre (NOC) and Security Operations Centre (SOC), as well as to streamline the incident management process so that IT incidents can be quickly detected and efficiently managed. This would enable services to be restored promptly and increase IT system resiliency.

BACKGROUND OF NOC AND SOC

NOC and SOC are synonymous with the smooth and secure running of today’s IT landscape. They are critical IT nerve centres of public and private enterprises throughout the world. Historically, NOCs and SOCs functioned as separate entities fulfilling different missions.

NOCs play a pivotal role in infrastructure availability and are often measured by uptime Service Level Agreements (SLA). NOC operators utilise a range of management tools to actively monitor and manage the performance and status of various IT infrastructure equipment such as routers and switches, with increasing expansion of scope to include servers, storage and data centre facilities. Through the use of these tools, NOC operators are able to perform fault management and coordinate service recovery efforts.

On the other hand, the emergence of increasingly advanced cyber threats has created a new dimension of challenge that goes beyond the capabilities of NOC management tools. SOCs were hence established with specialised tools to provide capabilities such as security information and event management as well as malware analysis, which enable cybersecurity analysts to focus on the deep investigative and forensic work required to accurately detect and respond to cybersecurity incidents. Working in tandem, NOCs and SOCs ensure the availability and integrity of IT systems, functioning much like a human’s central nervous and immune systems that detect and respond to infections.

SYNERGIES BETWEEN NOC AND SOC

As IT infrastructures grow in size and complexity to meet users’ increasing operational needs, NOCs and SOCs will need to work closely together to provide a holistic infrastructure and security view of the IT system. This will enable better sensemaking and situational awareness which will allow the NOC and SOC to
With the maturing of network system management technology, much infrastructure equipment today leverages common standards such as SNMP [2], SYSLOG [3], REST [4], JSON [5] and XML [6] to communicate with the management tools. This standardisation enables the enterprise NMS to communicate easily with the disparate infrastructure and security management tools and to understand the information they present. For legacy systems that still use proprietary communication methods, enterprise NMS now comes with a number of predefined equipment adapters, which makes it easy to reach out to these legacy systems without the need for self-customisation.

Data, Cybersecurity and Infrastructure Analytics

Traditionally, a major challenge in enabling end-to-end situational awareness is the inability to map out the relationship between various infrastructure equipment and their performance statistics and trends. The typical approach is to manually define relationship rules to link the various equipment together as well as manually inspect and correlate statistics. This approach is laborious, prone to human error and unsustainable.

Analytics capabilities found in today’s enterprise NMS are able to form the overall infrastructure topology and dependencies automatically from information obtained from the various infrastructure equipment to provide the operator with a real-

At the same time, by analysing the network inventory and configuration data, the NSOC will be able to automatically alert the operator to potential security vulnerabilities in the infrastructure and locate components that are not compliant with organisational security profiles.

Data and Software Integration

These new capabilities form the bedrock of an NSOC’s operation by fusing together the various NOC and SOC tools and providing the NSOC operator with a holistic end-to-end view of the interdependencies between the various infrastructure equipment as well as security incidents. This timely situational awareness facilitates faster and more accurate service impact analysis. This is important in assessing the actual health and performance of the application and in prioritising the corresponding service recovery.

The integration of technology also maximises cost effectiveness in building and maintaining the underlying management infrastructure, as well as paving the way for refining incident and problem management processes.

Processes

The establishment of an integrated NSOC eases information sharing and enables close collaboration between the previously separate NOC and SOC teams.
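The service impact analysis described under Data, Cybersecurity and Infrastructure Analytics and Data and Software Integration can be pictured as a walk over a dependency graph. The following is a minimal sketch only, assuming the NMS has already discovered which component depends on which. The component names and the functions build_dependents and impacted_by are hypothetical and do not come from any particular NMS product.

```python
from collections import defaultdict, deque


def build_dependents(links):
    """Invert (component, depends_on) pairs into depends_on -> {components}."""
    dependents = defaultdict(set)
    for component, depends_on in links:
        dependents[depends_on].add(component)
    return dependents


def impacted_by(failed, dependents):
    """Return every component that directly or indirectly depends on `failed`."""
    seen = set()
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for child in dependents.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical discovered topology: (component, what it depends on).
links = [
    ("switch-1", "router-1"),
    ("server-a", "switch-1"),
    ("server-b", "switch-1"),
    ("email-service", "server-a"),
    ("storage-1", "router-1"),
]
dependents = build_dependents(links)

# Which components are affected if switch-1 fails?
print(sorted(impacted_by("switch-1", dependents)))
```

A breadth-first walk is used here because it handles arbitrarily deep dependency chains and tolerates cycles in the discovered topology; a real NMS would likely also weigh redundancy and alternate paths, which this sketch ignores.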
Disparate processes can now be streamlined and better automated. For example, instead of manning two separate incident response hotlines with two different teams performing their own work, one single hotline that handles both NOC and SOC incidents can be created as illustrated in Figure 1. The hotline operator can perform first level diagnosis using the integrated NSOC tools to identify whether there is an infrastructure fault or cyber incident and route the incident to the respective service recovery teams. If the cause of the incident is not straightforward, it can be escalated to second level NOC and SOC specialists to perform more in-depth investigation. Commonly occurring incidents can also be automatically identified and routed to the service recovery teams without the need for operator involvement.

Centralising Case Management

All incidents are tracked via a common case management system that automatically monitors their progress and flags cases for escalation if service recovery will breach the established service level agreement. The case management system also consolidates the incident resolutions into a common knowledge base that operators can refer to when incidents of a similar nature occur, further improving triage accuracy and reducing service recovery lead time.

The integration, streamlining and automation enabled by NSOC make it easier for operators to perform their jobs and focus on incident management and service recovery tasks, as less system training and maintenance is required.

People

The availability and sustainability of suitably trained operators is an increasing concern in today’s manpower landscape as the skilled engineering pool is decreasing over the years. This is an issue that needs to be systematically addressed.

Beyond the technological enhancements, the process streamlining and automation in NSOC create opportunities to optimise manpower headcount and at the same time make operators feel more engaged with higher value tasks.

For example, NOC operators are experienced in server, desktop and network support and will have good troubleshooting skills and TCP/IP protocol suite knowledge. The same set of skills is also necessary for SOC tasks. Hence, instead of engaging two persons to perform overlapping tasks, better synergy can be achieved by cross training staff so that each operator can perform first level tasks for both the NOC and SOC. This gives a more holistic meaning to the operator’s job while at the same time allowing for the creation of a leaner team.
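The first level routing described under Processes and the SLA-based escalation described under Centralising Case Management can be sketched as follows. This is an illustration under stated assumptions, not the NSOC's actual case management logic: the category names, team names, the Case fields and the idea of an operator-supplied estimate of remaining recovery time are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical category-to-team mapping; the article does not define one.
ROUTES = {
    "infrastructure_fault": "NOC service recovery team",
    "cyber_incident": "SOC service recovery team",
}
SECOND_LEVEL = "second-level NOC and SOC specialists"


@dataclass
class Case:
    case_id: str
    category: str                   # result of first level diagnosis
    opened_at: datetime
    sla: timedelta                  # agreed maximum time to restore service
    estimated_remaining: timedelta  # assumed estimate of time still needed


def route(case: Case) -> str:
    """Send clear-cut cases to a recovery team, anything else to second level."""
    return ROUTES.get(case.category, SECOND_LEVEL)


def needs_escalation(case: Case, now: datetime) -> bool:
    """Flag the case if projected recovery falls after the SLA deadline."""
    deadline = case.opened_at + case.sla
    return now + case.estimated_remaining > deadline


opened = datetime(2024, 1, 1, 9, 0)
now = datetime(2024, 1, 1, 11, 0)
cases = [
    Case("C-1", "infrastructure_fault", opened, timedelta(hours=4), timedelta(hours=1)),
    Case("C-2", "cyber_incident", opened, timedelta(hours=4), timedelta(hours=3)),
    Case("C-3", "unknown", opened, timedelta(hours=4), timedelta(hours=1)),
]
for case in cases:
    print(case.case_id, route(case), needs_escalation(case, now))
```

In this example only C-2 is flagged, because its projected recovery at 14:00 falls after the 13:00 SLA deadline, while C-3 is routed to second level because its cause was not identified at first level.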
Detecting Infrastructure Anomalies and Forecasting Capacity Growth

By performing analytics on historical performance trends, the Service Management Layer flags deviations in infrastructure behaviour for further investigation into potential cybersecurity threats or infrastructure faults. This information is further extrapolated to estimate infrastructure growth requirements, prompting the NSOC manager to make the necessary adjustments before degradation occurs.

Assessing Infrastructure Vulnerability and Compliance

The Service Management Layer assists the NSOC operators in auditing and ensuring alignment with organisation policies as well as identifying potential security issues and bugs. This is done by actively scanning the infrastructure equipment to determine whether it exposes known vulnerabilities, and by automatically analysing infrastructure inventory and configuration information to determine whether vulnerable software components are installed.

Facilitating Incident and Problem Management

A case management system is provided within the Service Management Layer to centrally track all IT infrastructure incidents. This system comes with workflow automation, SLA monitoring and knowledge management functions that enable NSOC managers and operators to make swifter and more informed decisions while reducing human error. At the same time, it identifies regularly recurring incidents and prompts the NSOC operator for further investigation and escalation so that the actual root cause can be identified, hence increasing future system availability.

CHALLENGES

The consolidation of conventionally separate and independent NOC and SOC into a common NSOC enables incidents to be addressed more holistically and efficiently to increase system resiliency and maintain operational effectiveness. However, there are three inherent challenges that need to be addressed in order for NSOC to materialise.

Foremost on the list are ownership and skillset issues. Typically, NOCs and SOCs each have their own system owners. When they are merged, there is a need to iron out issues such as who will be the owner and final decision maker for the NSOC. For example, a NOC operator may interpret a device outage event as an indicator of equipment failure while a SOC analyst may interpret that same event as an indicator of compromised equipment. At the same time, beyond the fundamental infrastructure and system technical skills, SOC skillsets are investigative in nature while NOC skillsets are more focused on troubleshooting and recovery. The NOC and SOC staff will need to cross train and adjust their mindsets and mental models. They will also need to expand their range of skills more rapidly and react faster to the increased number of technologies involved.
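As a rough illustration of the anomaly flagging and capacity extrapolation described earlier under Detecting Infrastructure Anomalies and Forecasting Capacity Growth, the sketch below flags a reading that deviates strongly from its historical baseline and fits a straight line to past usage to estimate when capacity would be reached. The article does not say which analytics the Service Management Layer uses, so the standard-deviation test, the threshold of 3.0, the linear growth model and the sample figures are all assumptions made for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """True if `latest` lies more than `threshold` standard deviations
    from the mean of `history` (threshold is an assumed tuning value)."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold


def periods_until_capacity(usage, capacity):
    """Fit a least-squares line to `usage` (one sample per period) and return
    how many periods after the last sample the line reaches `capacity`.
    Returns None when usage is flat or shrinking."""
    n = len(usage)
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(usage)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, usage)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    if slope <= 0:
        return None
    intercept = y_bar - slope * x_bar
    return (capacity - intercept) / slope - (n - 1)


# Hypothetical link utilisation samples (percent).
history = [40, 42, 41, 39, 43, 40, 41]
print(is_anomalous(history, 75))   # far outside the usual range
print(is_anomalous(history, 42))   # within the usual range

# Hypothetical monthly storage usage (percent of total).
print(periods_until_capacity([50, 52, 54, 56, 58], capacity=70))
```

With the sample storage figures growing by two points per month from 58 percent, the fitted line reaches 70 percent six months after the last sample, which is the kind of lead time that would prompt an adjustment before degradation occurs.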