Itc16 PLC


Machine Learning-based Defense Against Process-Aware Attacks on Industrial Control Systems


Anastasis Keliris∗, Hossein Salehghaffari∗, Brian Cairl∗,
Prashanth Krishnamurthy∗, Michail Maniatakos† and Farshad Khorrami∗

∗ Tandon School of Engineering, New York University, New York, USA
† New York University Abu Dhabi, Abu Dhabi, UAE

Email: {anastasis.keliris, h.saleh, bc1152, prashanth.krishnamurthy, michail.maniatakos, khorrami}@nyu.edu

Abstract—The modernization of Industrial Control Systems (ICS), primarily targeting increased efficiency and controllability through integration of Information Technologies (IT), introduced the unwanted side effect of extending the ICS cyber-security threat landscape. ICS are facing new security challenges and are exposed to the same vulnerabilities that plague IT, as demonstrated by the increasing number of incidents targeting ICS. Due to the criticality and unique nature of these systems, it is important to devise novel defense mechanisms that incorporate knowledge of the underlying physical model, and can detect attacks in early phases. To this end, we study a benchmark chemical process, and enumerate the various categories of attack vectors and their practical applicability on hardware controllers in a Hardware-In-The-Loop testbed. Leveraging the observed implications of the categorized attacks on the process, as well as the profile of typical disturbances, we follow a data-driven approach to detect anomalies that are early indicators of malicious activity.

I. INTRODUCTION

Automatic control systems ensure the stable operation of industrial environments and provide monitoring and management capabilities for the underlying physical processes. Examples of industrial environments include water treatment and water desalination plants, assembly lines and manufacturing processes, chemical processes, and electric power systems. The nature and significance of these environments render them parts of critical infrastructure.

These industrial processes and their associated control systems are typically referred to as Industrial Control Systems (ICS). The two major types of ICS with regard to the nature and topology of the controlled industrial process are: i) Distributed Control System (DCS), where the system is divided into distributed and decentralized subsystems each responsible for its own local process, and ii) Supervisory Control and Data Acquisition (SCADA), where the control of the entire system is centralized and the system typically spans over a large geographical area [1].

Over the past years, the hardware and software components of ICS have been upgraded towards a more modern and “smart” critical infrastructure that has increased efficiency, controllability, and reliability. The addition of computing capabilities and inter/intra-connectivity to ICS promises lower production and maintenance costs, faster emergency response times, fewer incidents, and shorter downtimes. This modernization trend is enabled by the proliferation of cheap general purpose Commercial-Off-The-Shelf (COTS) hardware and software [2]. A contemporary ICS typically incorporates microcontrollers and common-architecture embedded microprocessors (e.g., ARM-based) running commodity operating systems, such as Wind River's VxWorks, Mentor Graphics' Nucleus and Unix-based Real Time Operating Systems (RTOS). Other advanced features include web servers with graphical user interfaces for configuration and monitoring, File Transfer Protocol (FTP) servers, common networking standards, and remote maintenance capabilities [3].

The use of COTS components in critical infrastructure settings is attractive since it provides the immediate benefit of robust hardware and stable, readily available software modules. At the same time, however, vulnerabilities discovered in COTS products can be promptly ported to industrial environments, extending the cyber-security threat landscape of ICS [4]. In addition, common IT protocols used for ICS communication have known vulnerabilities and exploitation techniques, enabling elaborate attacks. Even the assurances of air-gap networks are not adequate against motivated attackers, as demonstrated by Stuxnet [5]. Cyber-attacks against ICS are happening at an alarming pace. In 2014, the ICS Cyber Emergency Response Team (ICS-CERT) received and responded to 245 incidents in the US, whereas in 2015 the number of incidents reported grew to 295 [6], [7]. At the same time, the ICS security market is expected to grow to $11.29 billion by 2019 [8]. Table I aggregates information on high-impact ICS attacks from 2000 to date.

ICS security has been traditionally handled using network security and conventional IT security practices. ICS security goals, however, differ greatly from traditional IT security goals. Straightforward adoption of IT security solutions fails to address the coupling between the cyber and physical components in an ICS [14], as well as the demand for high availability of the monitoring and control functions [15]. For example, while an email system can afford short delays in delivering messages, a short disruption of the control process in an ICS can have devastating effects ranging from environmental disasters to significant financial losses, or even loss of life.

In this paper, in order to address the cyber and physical coupling in ICS, we develop a process-aware supervised learning defense strategy that takes into consideration the operational behavior of an ICS to detect attacks in real-time.

Paper 12.2, INTERNATIONAL TEST CONFERENCE. 978-1-4673-8773-6/16/$31.00 ©2016 IEEE

TABLE I
TIMELINE OF HIGH-IMPACT ATTACKS TARGETING ICS

Description | Impact
In 2000, the SCADA system that controlled a Queensland sewage treatment plant was accessed and controlled by a former employee of the software development team, after his job application was rejected [9]. | About 800,000 liters of raw waste were pumped into a nearby river and the grounds of a resort hotel, killing wildlife and plants.
In 2008, the control and monitoring system of Baku-Tbilisi-Ceyhan oil pipeline in Turkey was attacked by a terrorist organization. The attackers gained the entry onto the system by exploiting the vulnerabilities of the camera communication software, and then disabled the alarm system and manipulated the pressure in the pipeline [10]. | An explosion on the pipeline caused more than 30,000 barrels of oil to spill in an area above a water aquifer, and cost British Petroleum and its partners $5 million a day in transit tariffs during the closure.
In 2009, the system responsible for detecting pipeline leaks for oil derricks off the Southern California coast was hacked by a disgruntled employee [11]. | The monitoring system was disabled temporarily. The coastline was exposed to environmental disasters.
In 2010, the Stuxnet computer worm infected the software of at least 14 industrial sites in Iran, including a uranium enrichment plant. It was introduced to the target system via USB drives and then repeatedly replicated itself [12]. | The worm spied on the operations of the target system. It then used the information it had gathered to control the centrifuges, forcing them to tear themselves apart.
In 2014, sophisticated attackers used spear-phishing and social engineering to gain access to the office network of a steel plant in Germany, from which the attackers then broke into the organization's production network [13]. | Outages on control components and production machines prevented the plant from properly shutting down a blast furnace, resulting in significant physical damages.


To better understand the requirements of such a detection module, we analyze the different categories of attacks, and study their implications on the ICS process. The accuracy of our analysis is enhanced by the inclusion of actual hardware in a Hardware-In-The-Loop (HITL) setup, which introduces realistic disturbances to the simulation model and also enables us to demonstrate a complete payload delivery mechanism.

The contributions of this paper are twofold:
• Exploration of the attack surface for process-aware payloads and demonstration of complete ICS attack vectors in a controlled lab setup.
• Development of a real-time robust machine learning classifier that can detect several possible abstract categories of attacks and distinguish them from process disturbances, and redundancy-based mitigation through automated control switching.

The rest of the paper is organized as follows: Following the problem formulation in Section II, Section III presents the methodology used in this paper. We provide an overview of the utilized benchmark process in Section IV, followed by an exploration of the different attack categories in Section V. We present our experimental setup in Section VI and justify our decision for incorporating hardware devices in a HITL testbed. Section VII describes two process-aware attacks including the end-to-end payload delivery mechanism, and the attacks' impact on the overall process. Details and experimental results of our proposed online machine learning detection and mitigation strategy are presented in Section VIII. We compare this work with related work in Section IX, and conclude the paper in Section X.

II. BACKGROUND AND PROBLEM FORMULATION

This paper focuses on the development of attack detection and mitigation strategies for ICS, with emphasis on control system implementations on Programmable Logic Controllers (PLCs). ICS are comprised of one or more dynamic processes, sensors, actuators, computational components such as PLCs that implement and execute control algorithms, and communication components. Fig. 1 depicts a simplified layout of ICS components and their interconnections.

Fig. 1. Simplified layout of ICS architecture

The characteristics of the underlying regulated dynamic processes depend on the application domain, and can include continuous-time or discrete-time dynamical systems (that are, in general, time-varying and non-linear), hybrid combinations of continuous-time/discrete-time, discrete-event or event-driven dynamics, and combinations of multiple dynamic components in a centralized or distributed structure. Furthermore, the control system implementations can also be of a wide variety of structures. However, the fundamental concept in any control system component is a feedback interconnection utilizing sensors and actuators, wherein real-time information from one or more sensors is utilized to compute commands to one or more actuators. For example, in the case of a chemical process, the real-time reading of the pressure in a reaction vessel can be utilized in a feedback loop to compute real-time commands to a valve. The valve regulates the reaction vessel pressure to a desired value, also known as a setpoint in control theory terms. While there is a considerable variety of feedback control methodologies, the Proportional-Integral-Derivative (PID) controller is the most commonly used control structure in industrial applications.

Depending on the application domain, feedback control implementations can span a range of time scales, from a few milliseconds for fast electromechanical systems, to several
seconds for slow chemical processes. The controller parameters (e.g., the PID gains) are tuned based on closed-loop performance objectives. The numerical values of the controller parameters also depend on the time step utilized in the controller implementation. ICS can have multiple control loops possibly implemented on multiple controller computational platforms, with each of these loops regulating different parts of the overall process. Furthermore, feedback control components can be interconnected in cascade or parallel combinations.

Since the proper operation of an ICS depends on the real-time feedback control loops, an ICS can be susceptible to various types of process-aware attacks that attempt to hamper system performance and stability, by modifying any one of the constituent components, or a combination thereof. The potential points of interest of an ICS from a security perspective are:
• Sensor components
• Computational components (i.e., controllers)
• Actuator components
• Communication from controllers to actuators
• Communication from sensor to controllers
• Remote communication mechanisms

A. Process-aware attacks

In comparison with generic attacks that target the computational or communication elements, process-aware attacks attempt to utilize knowledge of the dynamic process being controlled, typical sensor and actuator time signals, control algorithms and implementation mechanisms, to negatively impact the closed-loop system. While the attack payload is process-aware, the entry points to the system are generic vulnerabilities of the computational or communication COTS components.

By leveraging vulnerabilities in the computational units that implement the real-time controllers, an attacker can modify control parameters, time step settings, or trigger conditions in an event-driven controller. Modification of the control logic and control parameters can, in general, be targeted to generate various types of effects on the closed-loop system, such as loss of performance or system stability, slow effects over a longer time interval, effects under a specific trigger condition, modified system behavior, etc.

B. Process-aware defenses

To guard against process-aware attacks and counteract their effects, process-aware resiliency mechanisms are crucial in addition to and in conjunction with best-practice security methods for the computation and communication components in the system [3]. The two main categories of defense strategies for ICS are i) approaches based on dynamic models of the system, and ii) machine learning-based approaches. Dynamic model based approaches utilize the existing model of a system, and aim to detect anomalies that do not conform to the dynamic equations and control laws that govern the system behavior. Machine learning algorithms train models to detect deviations from the normal operation of the system.

A common assumption for both approaches is that an attack will have observable impact. However, the first approach additionally assumes that a representative mathematical model of the system is available, whereas such a model is not required by the second approach. Furthermore, dynamic models often have high complexity and are specific to the instantiation of the system considered in the analysis, in contrast to machine learning algorithms that tend to be less complex and can more efficiently generalize and scale to defend an updated version of the system or another ICS altogether. For these reasons, we adopt a machine learning defense approach in this work, and train a module to detect whether the process has been maliciously modified or tampered with.

C. Threat model

The threat model considered in this paper is similar to that of the well-known Stuxnet worm [5]. In our threat model, we assume 2 malicious entities:

The attack designer: A technically capable adversary, who has partial prior knowledge and implementation specifics of the target process, equipment, and sufficient budget. The attack designer can for example be a medium to large sized corporation, or a nation state.

The attack launcher: A technically incapable adversary, who has physical access to the facilities and ICS devices. The attack launcher can for example be low-level staff at the ICS facility who has either financial motives, or was blackmailed to carry out the attack. The attack launcher possesses a “box”, which automatically carries out the attack when connected to the system without the need for user input, effectively delivering a Red-Team-In-a-Box (RTIB) attack.

III. METHODOLOGY

To develop a process-aware defense strategy tailored for ICS, we follow a full vulnerability assessment cycle. In general, the goal of vulnerability assessment is to identify potential threats to a system, analyze risks in order to provide mitigation techniques, and prevent any deviation from the system's expected operation. In the case of ICS, the steps of such an approach can be:

Reconnaissance: Gather information regarding the ICS components and configuration details.
Vulnerability discovery: Find possible attack entry-points, implementation weaknesses, and configuration flaws.
Attack vectors and impact analysis: Formulate attack vectors, develop exploits, and analyze their impact.
Mitigation: Develop detection and mitigation techniques for the attack vectors.

We adopt this vulnerability assessment approach because it enables us to explore the categories of attacks against any arbitrary ICS system in a structured manner, and provides better understanding of the nature and impact of attacks. This attack-oriented analysis provides insight into the implications of attacks on the overall process, enabling the formulation of requirements for a detection module.
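
As a brief aside on Section II: that section notes that the numerical values of controller parameters depend on the controller time step, and that an attacker can modify control parameters or time step settings. The sketch below illustrates this coupling on a toy loop and is not the paper's TE model or PLC code. The first-order plant, the gain values, and the names `simulate_pi`, `overshoot`, `KP`, and `KI_CONTINUOUS` are all hypothetical choices made for illustration.

```python
"""Toy discrete-time PI loop: why stored gains depend on the controller time step."""


def simulate_pi(kp, ki_per_step, dt, t_end=60.0, setpoint=1.0, tau=5.0):
    """Run a PI controller against a first-order plant dy/dt = (u - y) / tau.

    ki_per_step is the integral gain as stored in the controller, i.e. the
    continuous-time integral gain already multiplied by the time step dt.
    The plant and all numeric values are assumptions for illustration only.
    """
    y = 0.0
    integral = 0.0
    outputs = []
    for _ in range(int(round(t_end / dt))):
        error = setpoint - y
        integral += ki_per_step * error   # integral term accumulates once per step
        u = kp * error + integral         # PI control law
        y += dt * (u - y) / tau           # forward-Euler update of the toy plant
        outputs.append(y)
    return outputs


def overshoot(outputs, setpoint=1.0):
    """Largest excursion above the setpoint (0.0 if the output never exceeds it)."""
    return max(0.0, max(outputs) - setpoint)


KP = 2.0
KI_CONTINUOUS = 0.5  # per second; hypothetical tuning

# Gains tuned for a 0.5 s time step.
tuned = simulate_pi(KP, KI_CONTINUOUS * 0.5, dt=0.5)
# Time step changed to 0.1 s and the stored integral gain rescaled accordingly.
rescaled = simulate_pi(KP, KI_CONTINUOUS * 0.1, dt=0.1)
# Time step changed to 0.1 s but the stored integral gain left untouched.
stale = simulate_pi(KP, KI_CONTINUOUS * 0.5, dt=0.1)

if __name__ == "__main__":
    for name, run in (("tuned", tuned), ("rescaled", rescaled), ("stale", stale)):
        print(f"{name:9s} overshoot={overshoot(run):.3f} final={run[-1]:.3f}")
```

In this toy setup, all three runs settle at the setpoint, but the run that changes the time step without rescaling the stored integral gain overshoots noticeably more than the rescaled one. The point of the sketch is only that a time-step or gain modification alters closed-loop behavior even when the control logic itself is unchanged.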



As a benchmark model, we use a dynamic model of a complex, non-linear process, namely the Tennessee Eastman (TE) chemical process. We integrate hardware PLCs in a HITL experimental testbed, which adds realistic disturbances to the simulation model. Furthermore, the existence of hardware is essential for the development and demonstration of a complete attack vector with a payload delivery mechanism, following steps similar to ones an attacker would when targeting an ICS, including reconnaissance and vulnerability discovery.

For the detection module we follow a supervised learning approach, and in particular non-linear Support Vector Machines (SVMs). We develop a robust SVM model that can differentiate between disturbances during normal operation and malicious activity, and detect attacks shortly after their deployment on the system. Mitigation of detected attacks is achieved through automated control switching between redundant controllers. Although this redundancy-based strategy may not be cost-efficient, we argue that incident response time and overall system performance are superseding factors to controller costs. In addition, judicious selection of a subset of controllers that are critical to the process can reduce these duplication costs.

IV. BENCHMARK ICS: TENNESSEE EASTMAN PROCESS

The TE process shown in Fig. 2 is a complex, open-loop unstable industrial process. We selected this process as a benchmark model since it realistically encapsulates the dynamic behavior of real chemical processes [16]. It is composed of five operation units: a reactor, a product condenser, a vapor-liquid separator, a compressor, and a stripper. In this process, gaseous reactants, A through E, are fed to the reactor and a set of chemical reactions generates two liquid products, G and H, and one liquid byproduct F. The reactor product stream passes through the separator in order to condense the product. The non-condensed products are recycled back to the reactor through the condenser unit. The condensed products are then fed to the stripper unit where liquids G and H are removed. The byproduct is also purged from the stripper.

Fig. 2. Tennessee Eastman process schematic

Based on market demands, the TE process can work in three modes of operation, which are defined by the mass ratio of G to H in the product, and the product rate. These modes of operation are listed in Table II. Without loss of generality, in this work, we focus on the first operation mode. The primary control objective of the plant is to maintain the mass ratio of G to H in the product, while satisfying equipment constraints.

TABLE II
TE OPERATION MODES

Mode | G to H Mass Ratio | Production Rate (kg/h)
1 | 50/50 | 14076
2 | 10/90 | 14076
3 | 90/10 | 11111

We build upon the MATLAB Simulink model of the TE process provided in [17]. We extended the simulation model in order to investigate process-aware attacks and defenses. Furthermore, we incorporated a serial hardware interface that enables communication between the simulation model and a PLC, effectively realizing a HITL testbed. The Simulink model does not consider fast dynamics of some components within the MATLAB simulation (e.g., transmitter lags), but retains realistic behavior of gas phase dynamics and valve lags. The instantiation of the simulator considered in this paper includes 50 states, 41 measured variables with Gaussian noises, 12 manipulated variables, and 13 disturbance signals. A distributed control approach is implemented via 18 Proportional-Integral (PI) controllers, and the simulation time step is set to 1.8 seconds. Note that the large number of measured variables is due to the fact that some variables were included in the original simulation model for monitoring and research purposes, but do not have any impact on the control loops.

V. ATTACK CATEGORIZATION

In this section, we present the different categories of payloads that can be launched in an ICS environment. The section serves two main purposes: i) exploration of the different options an attack designer may consider when designing a payload, and ii) generation of comprehensive data for the different abstract categories of attacks that will be subsequently used for training a supervised learning detection module.

Towards investigating process-aware payloads, we carried out studies on the TE simulation model to understand the effects of attacks on the various control loops in the system. We have studied all control loops of the model. However, due to the large number (18) of distributed control loops in the process, showing all possible combinations of attacks and their implications on the complete set of loops is neither feasible, nor desirable. Instead, we focus on the implications of attacks on the overall process and present payloads that target one specific control objective in the model, without loss of generality.

Based on attack objectives and vulnerabilities of the specific implementation of an ICS, different components of a control loop may be more attractive from a security perspective. Attacks can be divided in three distinct categories, namely sensor attacks, actuator attacks, and controller attacks. In the
following subsections, we present attack examples, as well as their impact for all three categories. We utilize reactor pressure as a case study, as it is one of the most important variables of the TE process. Variability in reactor pressure and temperature can result in instability of the process. Small increases in pressure can halt the entire process, since the optimal operational value for minimizing production cost is set to 2800 kPa, very close to the shutdown limit of 3000 kPa. Decreasing the reactor pressure leads to increased production costs. These factors render the reactor pressure control loop attractive for attackers interested in negatively affecting the efficiency and stability of the system. For all ensuing scenarios in this section, the G production setpoint and the production rate setpoint are set to 53.8% and 23 m^3/h respectively, and all attacks are launched at t = 10 h.

A. Sensor Attacks

In sensor attacks, the attacker modifies/spoofs a sensor reading to affect closed-loop system operation. One example of such an attack is modification of the sensor value in a continuous manner, starting with a slow increase, and increasing the rate of variation while the attack progresses. The mathematical model of such an attack is:

$$\tilde{y}(t) = y_{sp} + \alpha e^{\beta (t - \tau)} \tag{1}$$

where $y_{sp}$ is the setpoint value of the output, $\tau$ is the attack launch time, and $\alpha$ and $\beta$ are tuning constants.

Fig. 3 and Fig. 4 show the effects of a payload falling under this category, launched against the stripper level sensor. The attack influences the process slowly at first, but its effect increases exponentially over time. Under this attack, the stripper level reaches the high shutdown limit after 5.65 h. Moreover, the production rate and operation cost deviate from their setpoints during the attack. Note that maintaining G production percentage and production flow rate constant, while satisfying safety constraints, are important objectives of the TE process.

Fig. 3. Stripper level response under sensor attack
Fig. 4. Performance indices of the system under sensor attack

B. Controller Attacks

The second category of payloads targets controllers, and modifies the control parameters of the process, or ultimately the control law itself. For example, modifying the controller PID gains directly influences controller performance, and will possibly result in deterioration of the overall performance of the process. In our case, the attack designer may change either the proportional or integral gain of one of the PI controllers of the TE process. One example would be a change in the gains by a multiplication factor:

$$\tilde{k}_i = \lambda k_i \tag{2}$$

where $\lambda$ is a constant, $k_i$ is the original designed gain, and $\tilde{k}_i$ is the modified gain value.

The effects of one such controller attack, where the proportional gain of the reactor pressure PI controller is multiplied with a constant numerical value, are shown in Fig. 5 and Fig. 6. The attack results in a decrease of the reactor pressure, which has a direct negative effect on the operating cost of the process. Although this attack does not have a large impact on the product quality, it significantly increases the operating cost.

C. Actuator Attacks

The final category of attacks targets actuator values, in which the payload modifies the actuator values to disrupt the system's operation in a manner difficult to detect, since the actuator values are typically the ones sent to the control center for monitoring purposes. One example of an actuator targeting payload is the addition of a small time-varying bias to the actual actuator value to disguise the attack, and slowly deteriorate the system performance without causing any instability outside the process's operational boundaries. The mathematical model of such a payload is:

$$\tilde{u}(t) = u(t) + a \sin(\omega t) \tag{3}$$

where $a$ and $\omega$ are constant values, $u(t)$ is the actual actuator value, and $\tilde{u}(t)$ is the modified actuator value.

The effects of this attack on the separator level control loop are shown in Fig. 7 and Fig. 8. The percentage of G production, which is the product quality metric, has oscillatory response under this attack model. Product quality oscillation is a very undesirable process behavior. Additionally, minimizing valve movements is one of the control objectives of the TE
process. This attack model causes an oscillatory response for the separator flow valve position (shown in Fig. 7), which will result in faster wear-out of the valve and subsequent decommission. The characteristics of this attack are similar to the ones of Stuxnet's payload that destroyed centrifuges by forcing an oscillatory response, reducing their life-span.

Fig. 5. Reactor pressure response under controller attack
Fig. 6. Performance indices of the system under controller attack
Fig. 7. Separator level response under actuator attack
Fig. 8. Performance indices of the system under actuator attack

VI. EXPERIMENTAL SETUP: HITL TESTBED

An important consideration when performing vulnerability assessments is proper selection of the assessment environment. Assessment environments may include software-only simulation models, production testing, or setup replication, each with its advantages and disadvantages. For example, while production testing and setup replication provide the most accurate results, they are not viable options for ICS. The former is inherently hazardous, given the interactions of ICS with the physical world, and the latter is cost prohibitive as it requires duplication of every component in the system. Software simulations have very low design costs, but fail to capture the complexity of ICS and cannot recreate the real-world conditions and interactions of cyber-physical systems. Hybrid methods try to address this trade-off by including one or more hardware components connected to a software simulation model in a HITL setup. This approach inherits the low design cost benefit of software simulations and the realistic disturbances that hardware inclusion contributes to a system. Moreover, from a security perspective, existence of hardware in the experimental setup enables a more thorough investigation of the system's security, as well as formulation of complete attack vectors, including payload delivery mechanisms [18]. For the aforementioned reasons, we adopt a HITL experimental setup in this work.

The experimental HITL testbed we developed for studying the TE process is depicted in Fig. 9. The Simulink model described in Section IV was modified by removing one of its control loops, and implementing the equivalent model on a PLC unit. The control loop offloaded to the hardware PLC is a cascade of two PI controllers driven by two sensors, responsible for controlling the reactor's pressure and purge rate. The cascaded PI-to-PI controller implemented on the PLC was tuned to closely match the behavior of its computer-simulated analog. The numerical results from the HITL simulator for any process initial condition and disturbance conditions are very similar to the pure simulation, but also include noise and errors due to multiple practically relevant hardware-related effects (e.g., random noise, baseline drift on analog signal lines, quantization effects on analog I/Os).

In terms of hardware, the primary PLC unit used in the HITL testbed is the Wago 750-881, because it is a good example of the transition from legacy-based structures to modern technologies. The Wago 750-881 features a 32-bit ARM CPU running a Nucleus RTOS, and a 32 KB non-volatile memory which holds the ladder logic files. The RTOS includes a web server and FTP service. In terms of networking, the Wago
supports HTTP, SNTP and SNMP protocols for diagnostics
and management, and EtherNet/IP and Modbus for Fieldbus
communication. Two Ethernet ports and a serial interface allow
programming and management of the PLC.

Fig. 9. Experimental HITL testbed configuration

For communication between the PC simulation model and
the PLC we utilized a Serial-Interface Board (SIB) with
analog-to-digital (A/D) and digital-to-analog (D/A) conversion
capabilities. Packets containing values corresponding to
the measured signals, which normally feed into the reactor
pressure and purge rate PI-loops, are transmitted from the
simulation host PC to the SIB over a USB-to-serial connection,
using Simulink’s serial interface functionality. All data are
transmitted as 32-bit, single-precision floating point values.
Before packetization, measurement values are scaled to
accommodate the SIB’s D/A output voltage range of 0 to 3.3
volts. Two D/A channels are routed from the SIB to analog
signal amplifiers, which rescale these signals to the PLC’s
analog input range of 0 to 10 volts. In a similar manner, the
output of the PLC is downscaled and transmitted via a D/A
peripheral. This signal is then routed to an A/D channel on the
SIB, which samples the signal and sends the resulting value
as serial data back to the Simulink model. This control value
is converted to its original units, and applied as an actuation
signal, closing the simulated process loop.

To achieve time-synchronization between the PLC and the
host PC we used a digital trigger. Each time the SIB receives
a data packet from the Simulink program, it triggers a PLC
control loop update cycle. The PLC is programmed to read
this signal via a digital input peripheral, and subsequently
execute a single control loop update cycle. Once the SIB
has sampled the updated PLC controller output, the acquired
control-loop output data is sent back to the Simulink host PC,
as previously described.

Programming the ladder logic on the PLC and monitoring
its progress are performed over the PLC’s Ethernet port, using
Wago’s CODESYS development environment. In addition to
incorporating the Wago PLC in our HITL setup, we also
implemented an equivalent interface with a Siemens S7-300
PLC unit, which serves as a back-up controller for the needs
of our mitigation strategy.

VII. DEVELOPMENT AND IMPACT OF ATTACK VECTORS

In this section, we demonstrate the development of a
complete attack vector, and the impact of two process-aware
payloads on the overall system. We argue that the steps
detailed here are similar to steps an attack designer would
follow when designing a payload and delivery mechanism.

A. Payload delivery mechanism

In order to compromise the PLC, an attack designer has to
find an entry point that does not disrupt the normal operation
of the system. The Wago 750-881, as described in the previous
section, has two Ethernet ports. Given that our threat model
allows for physical access, and under the assumption that one
port is utilized for connection between the controller and the
control center and the other is not occupied, we investigated
the feasibility of concurrent connections to the PLC. Our initial
finding was that the CODESYS environment does not allow
concurrent connections to the same PLC from two different
machines. However, by reverse engineering the proprietary
communication protocol between CODESYS and the PLC
with the help of Wireshark, and establishing a connection to
the PLC outside CODESYS, we discovered that concurrent
connections are possible. To avoid crashing the PLC, we enforce
a 1 ms delay between transmission of successive packets.

Disassembling the firmware of the Wago 750-881 verified
that our PLC suffers from the same vulnerability reported
for other Wago products (CVE-2012-3013). Particularly, the
firmware includes three hardcoded credentials which we can
leverage to perform privileged operations, such as file operations
and controller reset [19]. Furthermore, the FTP service on
the PLC does not require any authentication. To automate
communications, we developed scripts that mimic legitimate
communication, and allow us to send commands and files to
the controller, similar to the approach described in [20].

A checksum file is sent to the PLC along with the ladder
logic files, for verifying the integrity of the transmitted data on
the PLC-side. Since a payload with modified ladder logic files
must pass this check, we analyzed the checksum algorithm.
By comparing the checksum of different files and reverse
engineering the CODESYS executable we derived the checksum
algorithm to be a variation of the legacy SYSV algorithm.

The payloads we chose to deliver using the above attack
path fall under the category of controller attacks discussed
in Section V. In particular, we developed two payloads that
modify the proportional and integral gains of the PI controllers
implemented on the PLC respectively. Conforming to the
considered threat model, we automated the attack using an
Ubuntu BQ Aquaris E4.5 mobile phone that plays the role
of the RTIB. We additionally developed an application on the



phone, which detects when the phone is connected to a Wago
PLC and automatically launches an attack. The attack launcher,
who need not be technically capable, is only required to plug
the phone into the PLC, remaining completely oblivious of the
inner workings and underlying mechanisms of the attack.

The complete attack vector is outlined as follows:
• Connect RTIB to PLC via unused Ethernet port
• Establish communication link
• Login using hardcoded credentials
• Download ladder logic over FTP
• Modify gain variables in ladder logic
• Calculate checksum of new binary
• Send modified files to PLC via FTP
• Force-reload of boot project

B. Attack results

Fig. 10 and Fig. 11 show the results of the two attacks, both
launched at t = 20 h. The first attack replaces the original
proportional control gain of the Purge Rate PI control loop
with a slightly larger value. This attack causes the controller’s
performance to degrade, eventually resulting in lower reactor
pressure as seen in Fig. 10, and thus higher operational costs.
The second attack, shown in Fig. 11, replaces the integral gain
of the same control loop with its inverse value. This affects
the reactor pressure, causing it to slowly rise, and eventually
exceed the shutdown limit of 3000 kPa, halting the entire
process. In both cases, we can observe significant alterations
to the process outputs as a result of the attacks.

Fig. 10. Reactor pressure before and after proportional gain attack

Fig. 11. Reactor pressure before and after integral gain attack

VIII. PROCESS-AWARE DETECTION AND MITIGATION

The attacks described in Section V will be used to train a
machine learning-based attack detection module, which is the
ultimate goal of this work. To detect and counteract malicious
activity on an ICS, model-based and empirical knowledge of
the typical time signal patterns of the dynamic processes and
their dynamic characteristics can be utilized to ascertain if
an attack is on-going. One such process-aware approach is
machine learning-based classification. In particular, a general and
flexible classification approach is the Support Vector Machine
(SVM), which utilizes machine learning-based techniques to
generate a classifier that can assign input data to one of
several categories.

For the needs of our detection mechanism, we trained a
primary SVM to detect the presence of an on-going attack,
and a bank of secondary, separate SVMs to detect specific
categories and types of attacks. SVMs provide significant
robustness properties beyond simple range-based classifiers.
By utilizing learned knowledge of the run-time
interdependencies between multiple information streams, an SVM can
provide high detection accuracy. In comparison, simple
range-based attack detectors suffer from either a high number of false
positives when the ranges are set narrowly enough to detect
attacks reliably, or a high number of false negatives when the
ranges are widened to reduce false positives.

As with any classifier methodology, the accuracy of an SVM
crucially depends on how well it is trained. In the context of
the application considered here, the training set for the SVM
included large data sets from normal operation of the system,
as well as comprehensive data sets under various categories
of attack conditions. Furthermore, to accurately discriminate
between typical disturbance conditions under normal operation
and attack conditions, the training set included data sets
under the various disturbance conditions. Our attack detection
module utilizes data from the complete set of 12 sensors of the
TE process. Based on the accessible signal set y, the classifier
operates on a sliding time window of the measurements from
that signal set, i.e., the classifier computation at time k utilizes
y[k], y[k − 1], ..., y[k − N + 1], with N being a positive
integer wherein N = 1 corresponds to the simplest case,
in which the classifier operates on measurements from each
time step separately. To provide robustness to naturally noisy
data and reduce false positives, the classifier output y_c is time
windowed to generate the overall detection signal s based on
the time sequence y_c[k], ..., y_c[k − N_s + 1] with N_s being a
positive integer. In the SVM-based classifier implementation
here, a sliding window with N_s = 50 is utilized to provide
robustness against false alarms under disturbances and sensor
noise during normal operation, by only declaring an attack
detection when more than 90% of the classifier outputs during
the sliding time window correspond to attack detections.

Utilizing the RBF kernel and data sets from normal
operation and attack scenarios above under various disturbance
conditions, a set of SVMs was trained with N = 1 and
N_s = 50, first to detect an attack and second to identify
which specific type or category the attack falls into. We
present one of the SVM test vectors from the sensor attack
category in Fig. 12. To validate the robustness of our trained
models, we applied an unknown payload in addition to a
previously unseen disturbance. The disturbance was applied
on the A/C-ratio with constant B composition during time



t = 7 h to 8 h, with its effect remaining on the system until
about time t = 10 h, while a malicious payload was launched
at time t = 15 h. The trained SVM detected the abnormality in
the system 0.2 h after the attack started. The SVM output is
shown in Fig. 12 as a 0 or 1 with 1 denoting attack detection.
This demonstrates that our detection SVM can accurately
discriminate between a disturbance condition appearing during
normal operation and an attack condition. Note that under
such disturbance conditions, a simplistic range-based classifier
based on maximum and minimum thresholds would either
trigger false positive alarms, or fail to detect the attack,
depending on the looseness of the thresholds.

Fig. 12. SVM-based attack detection under sensor attack; process disturbance
injected between 7 h and 8 h, attack injected at 15 h.

A. Attack detection and mitigation experimental results

We integrated the SVM-based process-aware detection
algorithm described above into the developed HITL testbed to
detect abnormalities at run-time, and tested the model for
two different scenarios of controller attacks. The attacks used
the payload delivery method described in Section VII, while
the two payloads were controller attacks that modified the
proportional and integral gains of the reactor pressure control
loop respectively. Note that neither payload was used in the
training phase of the SVM model.

During each test, the process was allowed to run normally
for 2 hours with the SVM-based detection module active.
At the 2-hour mark, we deployed an attack through the
RTIB (Ubuntu phone). Once the SVM detected that an attack
occurred, the control was switched from the primary PLC
(Wago 750-881) to a secondary PLC (Siemens S7-300) that
executed the same controller in parallel. The experimental
results of our two tests, presented in Fig. 13 and Fig. 14,
demonstrate the effectiveness and robustness of our defense
strategy. Both attacks are detected, and control is switched
to the auxiliary PLC with a small time delay, such that the
overall process performance is not greatly affected. Note that
the transients after the SVM detects an attack are due to the
switching between different controller instantiations, and the
time the process requires to correct the short-term effects of
an attack.

Fig. 13. Reactor pressure under proportional gain attack with SVM mitigation

Fig. 14. Reactor pressure under integral gain attack with SVM mitigation

The backup PLC should preferably be of a different make
and model than the primary unit, making it less likely to suffer
from the same vulnerabilities. This architecture introduces
redundancy and decreases the overall susceptibility of the
control system to attacks, while at the same time increasing
the complexity and effort required by an attacker. Moreover,
in the event of false positive alarms, the control is switched to
an identical controller (from a control theoretic perspective),
reducing the severity of misclassifications.

IX. RELATED WORK

Due to the increasing complexity and ubiquity of
current-day ICS applications, significant research effort has been
undertaken on fault-tolerant and resilient control methods [21],
[22]. More specifically, machine learning based approaches
have been suggested for the TE process in previous works
for fault detection [23], [24]. In comparison, the work
presented in this paper focuses on process-aware security of
the process. We consider malicious modifications that have
been meticulously crafted and incorporate knowledge of the
process mechanics, in contrast to random faults that have a
significantly different impact on the process.

Prior works have studied process-aware attacks and
mitigation methods [25]–[28], and simulation-based analysis of the
effects of attacks and component failures in ICS applications
[29], [30]. These works follow a control-theoretic approach to
design and/or detect attacks, but do not take into consideration
noise and artifacts that hardware components can introduce to
the process. In addition, to the best of our knowledge, this
is the first work that demonstrates a complete attack and its
impact, including the design of process-aware payloads and
payload delivery mechanisms in a HITL testbed.

This work is closer to [31] and [32]. In [31], the authors
use a dynamic linearized model of the TE and follow a control



theoretic approach to formalize different types of attacks and
develop state estimators that can detect malicious behavior.
However, that paper only considers sensor attacks, and in
addition uses a simplified, linear approximation of the original
TE process. Similarly, the authors of [32] investigate a “secure
control” methodology, applied on the TE model towards
resilient control. The focus is on attacks, and specifically on
sensor attacks, and while the authors briefly discuss possible
defense objectives, they do not provide a complete defense
strategy. In comparison to these works, we have utilized the
non-linear TE model, and have explored attacks that span the
entire set of components included in the process (sensors,
controllers, actuators), providing a comprehensive process-aware
detection and mitigation strategy. Furthermore, none of the
discussed methodologies incorporate the realistic disturbances
that hardware inclusion contributes to the process.

X. CONCLUSION

In this paper, we demonstrated a process-aware defense
and mitigation strategy using supervised learning. We utilized
a HITL testbed of the TE chemical process to investigate
and understand process-aware attacks and their impact on
the overall process. We presented end-to-end attack vectors,
including payload delivery mechanisms. This analysis was
made possible by the inclusion of hardware PLCs in the
assessment environment. Using the knowledge obtained from the
investigation of attack categories, we trained an SVM model
that has the ability to detect abnormalities in real-time, and
can distinguish between disturbances and malicious behavior.
Our model was able to detect all previously unseen tested
payloads with small delays, while not triggering false alarms
in the presence of disturbances during normal operation.

ACKNOWLEDGMENT

This project was supported by the U.S. Office of Naval
Research under Award N00014-15-1-2182; and by the NYU
Center for Cyber Security (New York and Abu Dhabi).

REFERENCES

[1] H. Ernie, M. Assante, and T. Conway, “An abbreviated history of automation & industrial controls systems and cybersecurity,” SANS, 2014.
[2] R. Leszczyna, E. Egozcue, L. Tarrafeta, V. F. Villar, R. Estremera, and J. Alonso, “Protecting industrial control systems recommendations for Europe and Member States,” 2011.
[3] K. Stouffer, J. Falco, and K. Scarfone, “Guide to industrial control systems (ICS) security,” NIST special publication, vol. 800, no. 82, pp. 16–16, 2011.
[4] E. Byres and J. Lowe, “The myths and facts behind cyber security risks for industrial control systems,” in Proceedings of the VDE Kongress, vol. 116, 2004, pp. 213–218.
[5] N. Falliere, L. O. Murchu, and E. Chien, “W32.Stuxnet dossier,” White paper, Symantec Corp., Security Response, vol. 5, 2011.
[6] ICS-CERT, “ICS-CERT year in review,” [Online]: https://ics-cert.us-cert.gov/sites/default/files/documents/Year in Review FY2014 Final.pdf, 2014.
[7] ICS-CERT, “ICS-CERT monitor,” [Online]: https://ics-cert.us-cert.gov/sites/default/files/Monitors/ICS-CERT%20Monitor Nov-Dec2015 S508C.pdf, 2015.
[8] marketsandmarkets, “Industrial Control System (ICS) security market,” [Online]: http://www.marketsandmarkets.com/PressReleases/industrial-control-systems-security-ics, 2015.
[9] C. Blask, “ICS cybersecurity: Water, water everywhere,” [Online]: http://www.infosecisland.com/blogview/18281-ICS-Cybersecurity-Water-Water-Everywhere.html, Nov. 2011.
[10] J. Robertson and M. Riley, “Mysterious ’08 Turkey pipeline blast opened new cyberwar,” [Online]: http://www.bloomberg.com/news/articles/2014-12-10/mysterious-08-turkey-pipeline-blast-opened-new-cyberwar, Dec. 2014.
[11] D. Kravets, “Feds: Hacker disabled offshore oil platforms’ leak-detection system,” [Online]: http://www.wired.com/2009/03/feds-hacker-dis/, Mar. 2009.
[12] D. Kushner, “The real story of Stuxnet,” [Online]: http://spectrum.ieee.org/telecom/security/the-real-story-of-stuxnet, Feb. 2013.
[13] E. Kovacs, “Cyberattack on German steel plant caused significant damage,” [Online]: http://www.securityweek.com/cyberattack-german-steel-plant-causes-significant-damage-report, Dec. 2014.
[14] F. Khorrami, P. Krishnamurthy, and R. Karri, “Cybersecurity for control systems: a process-aware perspective,” IEEE Design & Test, vol. 33, no. 5, Oct. 2016.
[15] J. Weiss, “Assuring industrial control system (ICS) cyber security,” [Online]: http://csis.org/files/media/csis/pubs/080825 cyber.pdf.
[16] J. Downs and E. F. Vogel, “A plant-wide industrial process control problem,” Computers & Chemical Engineering, vol. 17, no. 3, pp. 245–255, 1993.
[17] N. L. Ricker, “Tennessee Eastman challenge archive,” http://depts.washington.edu/control/LARRY/TE/download.html.
[18] A. Keliris, C. Konstantinou, N. G. Tsoutsos, R. Baiad, and M. Maniatakos, “Enabling multi-layer cyber-security assessment of industrial control systems through hardware-in-the-loop testbeds,” in 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC). IEEE, 2016, pp. 511–518.
[19] M. Gjendemsjø, “Creating a weapon of mass disruption: Attacking programmable logic controllers,” 2013.
[20] D. Bond, “3S CODESYS,” http://www.digitalbond.com/tools/basecamp/3s-codesys/, 2012.
[21] P. Mhaskar, J. Liu, and P. D. Christofides, Fault-Tolerant Process Control: Methods and Applications. Springer, 2012.
[22] M. S. Mahmoud and Y. Xia, Analysis and Synthesis of Fault-Tolerant Control Systems. Wiley, 2014.
[23] A. Kulkarni, V. K. Jayaraman, and B. D. Kulkarni, “Knowledge incorporated support vector machines to detect faults in Tennessee Eastman process,” Computers & Chemical Engineering, vol. 29, no. 10, pp. 2128–2133, 2005.
[24] Y. Zhang, “Enhanced statistical analysis of nonlinear processes using KPCA, KICA and SVM,” Chemical Engineering Science, vol. 64, no. 5, pp. 801–811, 2009.
[25] M. Burmester, E. Magkos, and V. Chrissikopoulos, “Modeling security in cyberphysical systems,” International Journal of Critical Infrastructure Protection, vol. 5, no. 3-4, pp. 118–126, Dec 2012.
[26] C. G. Rieger, K. L. Moore, and T. L. Baldwin, “Resilient control systems: A multi-agent dynamic systems perspective,” IEEE International Conference on Electro-Information Technology, EIT 2013, May 2013.
[27] M. B. Line, A. Zand, G. Stringhini, and R. Kemmerer, “Targeted attacks against industrial control systems,” Proceedings of the 2nd Workshop on Smart Energy Grid Security - SEGS 14, 2014.
[28] F. Pasqualetti, F. Dörfler, and F. Bullo, “Attack detection and identification in cyber-physical systems,” IEEE Trans. Automat. Contr., vol. 58, no. 11, pp. 2715–2729, Nov 2013.
[29] T. Morris, A. Srivastava, B. Reaves, W. Gao, K. Pavurapu, and R. Reddi, “A control system testbed to validate critical infrastructure protection concepts,” International Journal of Critical Infrastructure Protection, vol. 4, no. 2, pp. 88–103, Aug 2011.
[30] M. J. McDonald and B. T. Richardson, “Position paper: Modeling and simulation for process control system cyber security research, development and applications,” [Online]: http://cimic.rutgers.edu/positionPapers/MichaelMcdonald-paper.pdf.
[31] A. A. Cárdenas, S. Amin, Z.-S. Lin, Y.-L. Huang, C.-Y. Huang, and S. Sastry, “Attacks against process control systems: risk assessment, detection, and response,” Proceedings of the 6th ACM Symposium on Information, Computer and Communications Security - ASIACCS 11, 2011.
[32] M. Krotofil and A. A. Cárdenas, “Resilience of process control systems to cyber-physical attacks,” in Secure IT Systems. Springer, 2013, pp. 166–182.
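As a supplementary illustration of the signal path in the HITL testbed description (engineering value, to the 0 to 3.3 V D/A range of the SIB, to the 0 to 10 V analog input range of the PLC, with values sent as 32-bit floats), the following minimal Python sketch shows one way such linear scaling and packing could be written. The function names, the 0 to 3000 kPa measurement range and the little-endian byte order are assumptions made for illustration; the paper does not specify them.

```python
import struct


def scale(value, in_lo, in_hi, out_lo, out_hi):
    """Linearly map value from [in_lo, in_hi] to [out_lo, out_hi].

    Inputs outside the source range are clamped (an assumption; the
    paper does not describe saturation handling).
    """
    value = min(max(value, in_lo), in_hi)
    return out_lo + (value - in_lo) * (out_hi - out_lo) / (in_hi - in_lo)


def pack_float32(value):
    """Encode a value as a 32-bit single-precision float (little-endian assumed)."""
    return struct.pack("<f", value)


def unpack_float32(payload):
    """Decode a 4-byte little-endian single-precision float."""
    return struct.unpack("<f", payload)[0]


# Hypothetical example: a 1500 kPa pressure reading on an assumed 0-3000 kPa range.
dac_volts = scale(1500.0, 0.0, 3000.0, 0.0, 3.3)   # SIB D/A output range
plc_volts = scale(dac_volts, 0.0, 3.3, 0.0, 10.0)  # after the analog amplifier
packet = pack_float32(dac_volts)                   # 4-byte serial payload
```

The return path (PLC output back to the simulation) would apply the same mapping in the opposite direction before the value is converted to its original units.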

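Section VII-A reports that the integrity check on transferred files is a variation of the legacy SYSV checksum. The exact variation is not given in the paper, so the sketch below implements only the textbook SYSV algorithm (as used by the classic `sum` utility), and the function name is hypothetical. It is included to show why a purely additive checksum guards against accidental corruption but not against deliberate modification: any reordering of the bytes leaves the result unchanged.

```python
def sysv_checksum(data: bytes) -> int:
    """Textbook System V checksum: byte sum folded to 16 bits.

    This is the standard algorithm, not the vendor-specific variation
    mentioned in the paper, which is not documented there.
    """
    s = sum(data) & 0xFFFFFFFF          # 32-bit sum of all bytes
    r = (s & 0xFFFF) + (s >> 16)        # fold the high half into the low half
    return (r & 0xFFFF) + (r >> 16)     # fold once more to absorb the carry


# Reordering bytes does not change an additive checksum.
original = b"abc"
reordered = b"cba"
same = sysv_checksum(original) == sysv_checksum(reordered)
```

A keyed message authentication code or a digital signature over the transferred files would be the usual way to close this gap, since neither can be recomputed without the secret key.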

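The time-windowed decision rule of Section VIII (declare an attack only when more than 90% of the last N_s = 50 classifier outputs indicate an attack) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class and function names are hypothetical, the per-sample classifier output y_c is taken as a given 0/1 stream rather than produced by a trained SVM, and the choice to report no detection until the window has filled is an assumption.

```python
from collections import deque


class WindowedDetector:
    """Turns per-sample classifier outputs y_c into a detection signal s."""

    def __init__(self, n_s=50, threshold=0.9):
        self.n_s = n_s
        self.threshold = threshold
        self.window = deque(maxlen=n_s)  # holds the last n_s outputs

    def update(self, y_c):
        """Add one classifier output (0 = normal, 1 = attack) and return s."""
        self.window.append(1 if y_c else 0)
        if len(self.window) < self.n_s:
            return 0  # assumption: no detection until the window is full
        # Attack is declared only if MORE than threshold * n_s outputs are 1.
        return 1 if sum(self.window) > self.threshold * self.n_s else 0


def run(outputs, n_s=50, threshold=0.9):
    """Apply the windowed rule to a whole sequence of classifier outputs."""
    detector = WindowedDetector(n_s, threshold)
    return [detector.update(y) for y in outputs]


# A short burst of attack-labelled samples is ignored; a sustained run is flagged.
burst = run([0] * 100 + [1] * 10 + [0] * 100)
sustained = run([0] * 100 + [1] * 60)
```

With these settings a burst of 10 attack-labelled samples never raises s, while a sustained run is flagged once 46 of the last 50 outputs are 1. The window length therefore trades false-alarm robustness against detection delay.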