Control-Flow Attestation: Concepts, Solutions, and Open Challenges
Control-Flow Attestation: Concepts, Solutions, and Open Challenges
Control-Flow Attestation: Concepts, Solutions, and Open Challenges
proposals being made in recent years, the area remains fragmented, addressing different adversarial
behaviours, verification paradigms, and deployment challenges. In this paper, we present the first
survey of control-flow attestation, examining the core ideas and solutions in state-of-the-art schemes.
In total, we survey over 30 papers published between 2016–2024, consolidate and compare their key
features, and pose several challenges and recommendations for future research in the area.
Table 1 begin introducing the core ideas behind CFA in §3, such
An overview of related survey papers. as definitions and the knowledge areas of state-of-the-art
Work Year Description
proposals. We then systematically discuss each area of this
Steiner & Lupu [15] 2016 RA methods for wireless sensor networks.
taxonomy, discussing prover-verifier paradigms in §4; trust
Arias et al. [16] 2018 A brief review of device attestation methods. anchors in §5; and methods for instrumenting and measuring
Sfyrakis & Groß [17] 2020 Hardware methods for attesting network infrastructures.
Johnson et al. [18] 2021 A review of RA schemes for embedded systems. proving programs in §6 and §7. We consolidate existing CFA
Ankergård et al. [19] 2021 State-of-the-art software-based RA for IoT devices. proposals according to these features in §8 and present a
Kuang et al. [20] 2022 A general review of RA schemes for IoT devices.
Ambrosin et al. [21] 2022 Collective RA schemes for large IoT networks. comparison table to aid future research in the area. Lastly,
This Paper 2024 A comprehensive analysis of control-flow attestation. we present several open research problems and recommen-
dations in §9 before concluding the paper in §10.
of control-flow graphs, and control-flow and non-control- be considered a misnomer in graph theory. A path is a trail in which all
vertices (and thus all edges) are distinct, whereas a walk is more general: it
data attacks addressed by CFA schemes. After this, we is a finite or infinite sequence of edges which joins a sequence of vertices.
1. CFG construction: The target program’s CFG is com- e.g. through improper input validation and sanitisa-
puted using static or dynamic analysis or a combi- tion, which is subsequently executed.
nation thereof. Static analysis uses the program code
using its binary, source code, or an intermediate form Stack-based buffer overflows are a canonical example,
(cf. LLVM’s intermediate representation [24]). This which involves overwriting function return addresses and
process is used to identify all possible control flow other variables on the stack to redirect execution to code
transfers before its execution. In contrast, dynamic to a malicious payload (e.g. shellcode). Heap-based attacks
analysis builds the CFG through giving inputs and are another approach, involving overflowing buffers on the
logging program execution. heap and affecting adjacent heap metadata and objects (e.g.
function pointers) that redirect control flow.
2. CFI enforcement: This is performed at runtime and More modern attacks involve the concept of code reuse,
involves a reference monitor that checks and enforces which rely on unauthorised sequences of existing authorised
the program’s adherence to the CFG derived in the code fragments in order to realise malicious functionality.
previous step. A common CFI enforcement method is
through program instrumentation [1]. Here, a check- (A.2) Code reuse attacks: Redirects a program’s control
point 𝐶(𝑝, 𝑡) is inserted at a control transfer instruction flow to execute existing executable code sequences,
at point 𝑝 in BBL, with a target, 𝑡. Here, 𝑝 and 𝑡 or gadgets, in order to perform malicious operations.
represent a logical CFG edge between one BBL 𝑣𝑖 and These attacks are highly flexible and can be used to
a target BBL, 𝑣𝑗 . Before executing any control transfer bypass common protections, e.g. DEP and W⊕X,2
instruction, the program passes control to the refer- by using executable code already present in memory.
ence monitor that verifies whether 𝐶(𝑝, 𝑡) is associated
with (𝑣𝑖 , 𝑣𝑗 ) ∈ 𝐸. That is, the monitor ensures that the A common example of a code-reuse attack is return-
transfer complies with the CFG. If the check fails, the oriented programming (ROP) [26], which relies on return
monitor may half the program’s execution, throw an instructions to chain together a sequence of gadgets. Each
exception, or take another enforcement action. gadget typically concludes with a return instruction; the next
gadget in the sequence starts based on the tampered return
In practice, control-flow transfers are instigated by in- address of the previous one. It is widely known ROP attacks
structions that alter the normal sequential execution flow, can achieve Turing-completeness [27]. Jump- (JOP [28]) and
directing it from the current instruction to a different, non- call-oriented programming (COP [29, 30]) attacks are two
consecutive address in the program. A control-flow transfer other code reuse attacks, which involve tampering forward
has two main attributes: a source address and a destination control-flow transfers, namely indirect jumps and indirect
address. Transfers are divided into forward and backward calls respectively, in order to chain sequences of gadgets.
ones, which redirect execution to a new location and re-
turn execution to the prior location respectively. On X86 2.1.2. Non-Control Data Attacks
platforms, the call and ret instructions are two examples Rather than directly modifying control data, e.g. function
of such instructions. Forward control-flow transfers may pointers, non-control data attacks rely on the manipulation
further be categorised into direct and indirect calls and of more general data to indirectly change a program’s con-
jumps. Branching and jump instructions may be executed trol flow [31, 32, 33]. Detecting non-control-data attacks is
according to some condition, e.g. jz (jumps if the zero flag, more challenging because they technically traverse legiti-
ZF, is set) and jnle (jump if not less or equal). mate edges of the program’s CFG, even though malicious
functionality ensues as a consequence. Let us consider a
2.1. Control-Flow Attacks motivating example. An adversary may execute arbitrary
Control-flow integrity and attestation schemes aim to commands on a Linux system if they can influence the
address the problem of control-flow attacks that manipulate arguments to a execve() call in the target program. The
control-flow transfers in order to perform unauthorised ac- target program’s CFG is respected, even though the resulting
tions. In this section, we describe the common attack classes, behaviour is unauthorised. Data attacks can be realised in the
and label them for later comparison in §8. following ways:
2.1.1. Control Data Attacks (A.3) Direct data manipulation (DDM) attacks: Refers
The primary class are attacks that directly modify control to attacks that maliciously modify the data used in
data used to that directly influences the processor’s pro- decisions that affect a program’s control flow.
gram counter (PC) during execution by modifying return
addresses and function pointers [25]. A fundamental attack Branch variable attacks are an example of (A.3) in which
vector, which we refer to as code injection, involves inserting an adversary influences data used in determining the out-
malicious code into executable memory and redirecting the come of conditional branches, i.e. in ‘if’ and ‘switch’ state-
PC to that code. ments. The resulting data causes another legitimate branch
2 Write XOR Execute (W⊕X) is a security feature that ensures memory
(A.1) Code injection attacks: The attacker directly inserts
malicious code into the target program’s memory, regions are either writable or executable, but not both simultaneously.
2.2. Control-Flow Transfer Types (C.2) Conditional direct branches/jumps: Includes all
In CFI and CFA, monitoring different control-flow trans- branching and jump instructions, represented as as-
fer types is used to detect particular unauthorised paths sembly or machine code, that alter the execution se-
within a target program. We note that it is often undesirable quence based on a specific condition. If the condition
to monitor every possible control-flow due to the significant is true, then the program counter is directly set to
a specified address, continuing execution from that
3 It is important to recognise this semantic gap: just because legitimate
address. In direct jumps, the target address or label
edges of a CFG are followed does not mean that unintended or unauthorised
is explicitly stated in the instruction. The address is
behaviour more abstractly can be performed.
fixed at compile time and does not change at runtime. 3. Introducing Control-Flow Attestation
A control-flow attestation scheme can be considered as
(C.3) Unconditional direct branches/jumps: Transfers C = (𝑇 , , , , 𝑉 ). The target program, 𝑇 , is executed
the control flow to a specified address or label without on the proving device, . The terms ‘attested program’ [4],
any condition using a jump instruction. jmp is the fun- ‘attestation program’ [46] and ‘selected program’ [47] are
damental mechanism for performing unconditional used synonymously in the literature.4 The proving device
direct jumps in X86-64 assembly. may execute other programs during attestation, and so we
adopt the term ‘target program’ from [34] for clarity.
(C.4) Direct calls: A direct call is used to invoke a subrou- denotes the control-flow transfer types (e.g. direct
tine or function by specifying the name or address of jumps, indirect calls) being monitored, which were pre-
the subroutine in the instruction, e.g. call (X86-64). sented in §2.2. A general requirement is that CFA scheme
(C.5) Indirect branches/jumps: Indirect jumps allow the must be able to monitor unique information about a BBL
program to transfer control to an address that is com- associated with a single control-flow transfer type. During
puted at runtime, rather than a fixed address specified collection, this information may include the addresses of a
in the code. This can include jumping to an address BBL’s first and final instructions; the control flow transfer’s
stored within a register, e.g. jmp rax; on the stack; or source and destination addresses, which delineates different
at an address pointed to in memory. BBLs; a unique label identifying the BBL;5 and the type of
control-flow transfer and its complete instruction represen-
(C.6) Indirect calls: Similar to indirect jumps, indirect tation, e.g. as machine code.
calls invoke a function whose address is computed at is the facility used to create measurements of the
run-time or stored in a register or memory location. control-flow followed by 𝑇 . We cover measurement tech-
niques, such as cryptographic hashes of CFG nodes, in §7.
(C.7) Return instructions: Return instructions are used is the procedure for reporting the set of measurements
at the end of a function to return to the location of collected by to the verifier, . This procedure usu-
the PC before the function was invoked. On X86, ally involves encrypting and signing collected control-flow
the instruction ret pops the top value from the stack, measurements, forming an attestation report. Lastly, 𝑉 is
which should be the return address pushed onto the the verification procedure used by in order to determine
stack by a previous call instruction, before jumping whether the measurements correspond to those which are
to that address. expected, i.e. a known, authorised run taken by 𝑇 . A binary
decision model is assumed here: if the verification procedure
Some proposals apply different CFI policies depend-
fails, then it indicates the presence of adversarial behaviour
ing on the location of the code being examined with the
and that one of the attacks described in §2.1 was performed.
aim of reducing performance overhead. For example, by
Typically, 𝑇 , , and are run on , while 𝑉 is run on .
dividing code into critical and non-critical functions through is usually run on , but it may be used by both entities.
manual policy specification or using statistical and machine In Figure 3, we illustrate the knowledge areas that char-
learning-based models [5, 9, 11, 42]. For example, ARI [43]
acterise their core elements. We posit that schemes can be
divides code into critical and non-critical compartments
delineated according to their attestation paradigm; the trust
based on a policy set. All control-flow transfers are moni-
anchor; the instrumentation method for collecting measure-
tored except unconditional direct jumps within critical com-
ments; the measurement method; and, finally, the adversarial
partments, but transfers within non-critical areas are not model under consideration. The coming sections describe
considered. OAT [6] attests partial code regions that are the proposals and applications within each area.
specified (annotated) by programmers. Within the attested
code, conditional direct jump, indirect jump, indirect call,
and return instructions are covered. ACFA [44] monitors all 4. Prover-Verifier Paradigms
control-flow transfers within pre-determined code regions We recognise that the majority of CFA schemes operate
including any transfer from outside to inside the region. under an interactive paradigm, which is initiated and verified
Other approaches monitor all control-flow transfers within by a (remote) verifier using a challenge-response protocol.
areas that process sensitive data, rather than denoted code This follows a conventional model of remote attestation used
regions (see DIAT [45]). Tracking load/store operations on by trusted computing systems, such as the Trusted Platform
critical data in memory has also been used, thus bridging the
4 Terms like ‘program,’ ‘target,’ or ‘application’ are also used in-
areas of control flow and data flow, but this remains a nascent
terchangeably, which may be adequate where only executes a single
area of research [34, 45].
program, e.g. microcontroller units.
5 IDs are used to overcome issues with Address Space Layout Randomi-
(C.8) Mixed: Different granularity CFI policies are applied sation (ASLR) in which instructions’ memory addresses change with each
depending on code locations or developers’ annota- program execution. Using only instruction addresses, for example, cannot
tion of code to be attested, i.e. different control-flow delineate between different BBLs in this scenario. To solve this challenge,
the 𝑇 is instrumented to insert BBL IDs to mark the beginning of BBLs,
transfers may be monitored accordingly. which do not exist in the original target program.
Control-Flow
Attestation
discrete hardware modules. A number of academic sys- deployment. (A notable exception is the proposal by Li et
tems have been proposed with similar design goals (e.g. al. [49], discussed previously in §4.5, which investigates
TyTAN [52] and TrustLite [51]). collective CFA for peer-to-peer devices).
The trust anchor is commonly used to protect ’s mea-
surement procedure and any key material used in generating 5.1. Trusted Execution Environments
attestation reports. For some proposals, this requires that the A trusted execution environment (TEE) [57] serves as a
trust anchor passes traditional static attestation and that it has common trust anchor in existing CFA proposals. A TEE is a
dedicated memory to store material that can only the anchor hardened environment in which code and data can be hosted
itself can access [13, 34, 37, 41]. For proposals that rely and executed with hardware-enforced isolation. It aims to
on TEEs, it is assumed that potentially applicable high-cost protect the integrity and confidentiality of code and data
attacks cannot be performed by an adversary, such as hard- at run-time and at rest from kernel-level adversaries in the
ware fault injections and side-channel attacks [50]. TEE- normal, rich execution environment (REE), such as Android
based proposals sometimes place trust in certain operating and Linux. Communication between the REE and TEE is
system components, such as kernel modules that initiate the achieved through an API that performs the switches between
communication with the TEE [6, 38]. In proposals where the the REE and TEE worlds, and authenticates and authorises
kernel serves as ’s trust anchor, it is somtimes assumed that the REE application that invokes functions within the TEE.
a method for protecting cryptography keys is available, such Different TEE implementations have been used, but we
as a TPM or a proprietary facility (see Intel MPK in [58]). observe that Intel SGX [9, 10, 40] and ARM TrustZone [7,
Very few CFA proposals are purely based in software, 14, 45, 46, 49] are widely used choices. Intel SGX follows
and most systems have some degree of hardware depen- the enclave model of trusted execution, where applications
dency. We observe that CFA schemes can be split into three are partitioned into sensitive and non-sensitive regions. Sen-
categories based principally on where security-critical CFA sitive portions are initially loaded by the (untrusted) operat-
components are implemented on : ing system into an enclave, where dedicated CPU extensions
provide hardware-enforced access control to enclave-hosted
• Software-based systems implement all process com- regions when the enclave is executing. The reader is referred
ponents at the software level, typically as libraries in to work by Costan & Devadas [59] for a comprehensive
user space and modules in kernel space. introduction to Intel SGX.
• Hardware-based proposals rely on a dedicated hard- Intel SGX differs to the trusted world model used by
ware for performing the measurement and reproting ARM TrustZone, which relies on dedicated operating sys-
operations. This may be a discrete on-chip security tems for handling sensitive and non-sensitive applications.
module, an off-chip micro-controller, or a set of be- We note that TrustZone-A [7, 14, 46, 49] and TrustZone-
spoke CPU extensions. M [45] have been used, which are compatible with the
ARM Cortex-A and -M CPU series respectively. In general,
• Hardware-assisted systems use software whose secu- hosting the measurement and reporting procedures in a user-
rity is underpinned by a hardware root of trust. TEE- mode TEE application is a common approach; that is, the
based proposals are characterised as such, as well measurement and reporting procedures execute at S-EL0 in
as proposals that rely on read-only memory (ROM), the ARM exception model. Additional extensions to the TEE
secure boot [45], and a TPM [13] in tandem with a are also required in a minority of proposals. Notably, Liu et
normal operating system. We also categorise propos- al. [7] use an SRAM-based physically unclonable function
als that rely on DEP and W⊕X that make data portions (PUF) for deriving device-unique keys used in generating
of the target program non-executable [11, 40]. attestation reports. Irrespective of whether Intel SGX or
ARM TrustZone is used, the same underlying hardware is
The following assumptions are also commonly made. used to execute normal and secure world applications.
Attacks against are out of scope and it possesses Figure 9 shows the general architecture of TEE-based
unbounded resources. State-of-the-art CFA schemes do not CFA schemes using C-FLAT [4] as an example. The target
consider attacks against . For example, from malicious program is executed in REE, while the measurement and
insiders, due to supply chain attacks, or any other vector reporting procedures run within the TEE. In this model,
with the intention of undermining the verification (𝑉 ) and a kernel-based trampoline implements the functionality to
measurement procedures (). Moreover, it is assumed that jump between the normal and TEE worlds. When an instru-
has sufficient computational and storage resources to mented target control-flow transfer is executed, several steps
execute 𝑉 and, if applicable, when required without are followed:
hindrance.
acts honestly. It is similarly widespread that makes 1. The control-flow event is passed to the trampoline,
no malicious modifications to 𝑇 . deploys the intended which gains control of the execution.
program on along with any instrumentation (if applicable)
in an honest fashion. It is not considered in the interest of 2. The trampoline forwards the control-flow event to the
to instrument code that triggers malicious procedures, TEE over a secure monitor that validates the data and
issue malicious challenges, or otherwise disrupt after its performs the world switch to the TEE.
REE TEE
Reporting 5 Measurement
Target Program Module Module
1 9 6 4
2 8 7 3
Secure Monitor
TEE-enabled Hardware
Code Legend:
addi sp,sp,-16
pipelined processor
sw ra,12(sp)
jalr zero,ra,0 Pre-existing
lw ra,12(sp)
addi sp,sp,16
components
1 LO-FAT
executed instructions components
controller
branch type, (Src, Dest)
engine
Hash 11
Hash
Branches memory
engine Hash A
5
Indirect calls: (Src, Dest)
7 new_path ctrl
Loop Monitor
8 loop_end ctrl
6
Loop P_IDs 9 Loop P_IDs & counters
generator
Metadata
Metadata 10
Loop counter memory Metadata
storage
L
Figure 11: An overview of Atrium’s system architecture [37].
6. Instrumentation Methods Where 𝑐𝑝𝐴 and 𝑐𝑝𝐵 are checkpoint identifiers representing
The collection of control-flow data can be achieved an arbitrary sequence of BBLs. 𝐻(𝐿𝑜𝐴) is the hashed mea-
through the process of instrumentation, which involves mod- surement value of a list of actions (LoA), which is the list of
ifying the target program to add instructions that capture, CFG edges for traversing 𝑐𝑝𝐴 → 𝑐𝑝𝐵 ; and (𝐵𝐵𝐿𝑠𝑖 , 𝐵𝐵𝐿𝑑𝑖 )
and measure, control-flow events of interest. Dispatch in- is the source and destination BBL at index 𝑖. The compiler
structions are used near target control-flow transfers or at is also used for detecting and instrument control-flow in-
the entry of BBLs to this end. When a target control-flow stances, which are passed to three kernel-space modules re-
transfer is executed at run-time, the dispatch instructions sponsible for generating measurements. To support this, the
call pass the transfer(s) to the measurement module for it to Measurement Generator analyses the LLVM intermediate
be measured. Execution is resumed after the measurement representation (IR) of the target program’s processed source
process has completed. code. The CRAB framework is used to generate the CFG
Instrumentation has been achieved in three ways: 1 at itself, which is used to resolve indirect forward jumps. We
the compiler level [6, 10, 11, 13, 14, 43, 48, 49], 2 by note that mapping LLVM to raw assembly BBLs involves the
rewriting compiled binaries [4, 7, 38, 39, 41, 45, 46, 58, 62], use of dummy code, which is later removed using a custom
or 3 using dynamic instrumentation [8, 9, 40, 42]. Auto- binary rewriting tool. The authors add approximately 3.5K
mated methods are involved in all cases. In theory, manual lines of code (LOC) to CRAB and LLVM 5.0.
instrumentation is a fourth possibility—that is, identifying GuaranTEE [10] also uses the LLVM compiler frame-
and labelling control-flow transfers by hand—but this is work (see §4.3). The authors perform the instrumentation in
infeasible for non-trivial programs. We recognise that not two LLVM phases: during IR optimisation and the backend
all CFA proposals require a dedicated instrumentation step. stages. In the IR pass, calls are added to the trampoline
This applies where hardware extensions are used to detect, at the start and end of identified BBLs, i.e. at the start
capture and measure control-flow events within the CPU’s and end of BBLs and direct function calls. The backend
instruction pipeline (e.g. Atrium [37] and LO-FAT [12]). In pass implements trampoline calls before and after indirect
the coming sections, we describe methods for instrumenting function calls, indirect jumps, and before return instructions.
target programs used by CFA schemes. Notably, the authors released their LLVM extensions as
open-source software.
BLAST by Yadav & Ganapathy [14] is an efficient CFA code. Intel’s MPK system is used to protect the integrity
scheme that relies on a combination of techniques for min- of instrumented data structures—the loop stack and path
imising TEE world switches and the Ball-Larus algorithm stack—used by control-flow events. The authors compare
for optimal instrumentation placement in target program this to SCaRR [13] where control-flow edges are measured
CFGs [78]. The authors instrument BLAST as a compiler in kernel space, which has a “potential enormous context
pass within the LLVM framework in a way that commits switching cost for the control-flow events on an order of
CFG path information to a log file on . A TEE is used to magnitude in gigabytes” (p. 6, [58]). The authors combine a
sign a hash representation of the log file, which is returned kernel mode-based functions to preserve code integrity and
to on request. MPK-protect data structures in order to uphold the integrity
LAPE [48] uses the LLVM-compatible ACES [79] com- of the measurement procedure.
piler to create and instrument ‘compartments’ on ARMv7 DIAT [45], which attests software modules on drone
devices. Programs execute within components where indi- platforms, realises the instrumentation by making custom
rect calls and function entry and exit points are recorded and extensions to the PX4 open-source flight controller software.
measured using a hash-based approach. We note that, despite Firstly, each software module is assigned a unique identifier
LLVM’s flexibility, there are some downsides. LLVM is typ- for identifying the various flight controller sub-components
ically not used unchanged: proposals tend to require exten- at run-time. The authors rely on the insertion of dispatch in-
sive modifications or extensions of the magnitude of 1,000s structions at relevant control-flow events. These instructions
LOC [10, 13, 14]. In practice, this may pose compatibility redirect execution to a monitor that measures the ID of the
issues when adapting the schemes to existing toolchains and software module in question and the source and destination
development environments. addresses of the control-flow event. The method by which
In other work, Hu et al. [11] use a combination of the dispatch instructions are inserted is not clear; however,
compiler-assisted instrumentation and binary rewriting— it is stated that the “modifications are applied to the code in
discussed in §6.2—to hook control-flow instances. The assembly form” (p. 9, [45]).
GCC compiler with the -finstrument-functions flag is used Liu et al. [7] instrument the target program on at a
to instrument function entry and exit points. Afterwards, binary level before it is deployed. A custom binary analysis
branches within functions are hooked using binary rewriting tool is developed that targets indirect jumps and computes
using the Capstone disassembly framework [80]. and stores all legal targets in a database held by . The
work instruments the following instructions pertaining to
6.2. Binary Analysis and Rewriting control-flow events (ARM assembly): bx lr, pop pc rx,
Binary rewriting involves making direct modifications to blx rx, bl rx and bx rx. The instrumentation is used to
the compiled target program, where control-flow instances jump to a trampoline module, discussed in §5.1, which sends
are represented and changed in the form of machine code. control-flow data to a measurement module implemented
This approach can be applied to existing programs, making within ARM TrustZone, which includes the values of return
it suitable for incorporating CFA into software where the addresses and function pointers to be reported to .
original source code is not accessible. The approach has
particular applications to retrospectively protecting legacy 6.3. Dynamic Instrumentation
and proprietary software. Dynamic instrumentation refers to the method of insert-
C-FLAT [4] developed a custom binary rewriting ap- ing monitoring and verifying code into a program while it
proach for ARM binaries. The authors identify the locations is running. This compares to instrumenting target programs
of control-flow instructions, such as indirect branches, which using static analysis and manipulating the executable before
are overwritten with dispatch instructions. The dispatch execution (e.g. using binary rewriting). Dynamic methods
instructions are used to trigger an assembly routine, which rely on the insertion of probes or tracking mechanisms into a
serves as a trampoline for transitioning into ARM TrustZone running program. These probes can monitor the control flow
(see §5.1). The Capstone disassembly engine [80] is used of the application in real-time, identifying any deviations
to identify control-flow transfers in program binaries to be from expected paths.
rewritten with dispatch instructions. Common dynamic instrumentation tools used more gen-
Recording every function call can impose significant erally in the area of control-flow integrity use Intel Pin [82],
performance overhead (see p. 3, [58]). To address this, DynamoRIO [83], and LLVM-based methods [84]. In the
ReCFA [58] derives a CFG from the target program us- area of control-flow attestation specifically, GACFA [9] re-
ing TypeArmor for performing call-site filtering, whereby lies on Intel Pin-3.15 to instrument the target program during
CFG nodes are skipped where none of its predecessors has runtime. Here, control flow events, which are collected using
more than one successor. ReCFA uses Dyninst [81] for the Intel Pin, are sent to the scheme’s measurement module
static binary instrumentation of user-space programs on . running in an Intel SGX enclave. The measurements are cal-
Dyninst provides a set of libraries and utilities designed culated using a hash-based approach; the final measurement
for the static (and dynamic) inspection and manipulation is signed under a secret key within the enclave and provided
of executables. Dyninst enables users to insert, modify, or within an attestation report. A similar approach is used in
remove code in a program’s CFG without access to its source the BDFCFA [40] proposal, which uses Intel Pin for binary
instrumentation, and presents an approach with bidirectional learn all of the legitimate control-flow paths within an offline
attestation for blockchain applications. phase. This may be stored in a database in the form of <key,
RADIS [8] uses the Python trace module to dynami- value> pairs. (Using hash values to aggregate and represent
cally instrument target functions at the interpreter level. A the nodes traversed by is a common approach [4, 8, 13, 37,
bespoke implementation of the module in a TEE is recom- 39, 41, 45, 46, 48, 49]). In the online phase, issues an at-
mended to protect the measurement procedure from kernel- testation challenge, and responds with a measurement that
mode adversaries; however, the proposal was neither imple- matches against the records in its database. This method
mented nor evaluated. suffers from the path explosion problem, where the number
of control-flow paths grows dramatically with the number
7. Measuring and Monitoring Control Flow of branches in the target program. The storage requirements
placed on can be enormous. It is also possible that ’s
The measurement process is one of the key facets of CFG may contain non-deterministic behaviours that are not
a CFA scheme. It is at the core of demonstrating to straightforward to calculate a priori as a legitimate path,
that an authorised path/walk was followed through the target such as branches that depend on input/output (I/O) data.
program’s CFG. For a given CFA scheme, the measurement In comparison, monitoring and measuring events at a
method typically remains constant, regardless of the control- control-flow transfer level can mitigate the challenges above.
flow paths or the target program in question. However, these Rather than tracking all potential control-flow paths, only
methods tend to vary significant between CFA schemes, legitimate control-flow transfers are stored. For example, us-
which we elaborate upon in this section. ing a whitelist or otherwise flagging any anomalous transfers
The measurement process measures the whole or part(s) that are encountered. On the down side, this method can
of the target program’s CFG: impose significant run-time overhead on if many control-
• Whole program measurement: The whole control- flow transfers are monitored and measured during execution.
flow path used by the target is measured, comprising An interesting observation can thus be made: the choice
the nodes traversed by using a challenge supplied of measurement approach reflects a time-space complexity
by . The task of is to determine whether the mea- trade-off between the storage requirements of and the run-
surement returned by corresponds to a known, ‘safe’ time overhead on .
value reflecting the execution of the target program
7.1. Hash-based Measurements
from start to finish.
A naïve method is to instrument the target program on
• Partial measurement: Particular aspects of the control- in order to capture every control-flow transfer. When
flow path are measured, such as loops, certain branch- traverses the CFG of a target program, stores a list of these
ing instructions, functions etc. The overall measure- transfers and sends it to as part of an attestation response.
ment may comprise several such measurements as a However, such an approach can lead to prohibitively large
proxy that the target program had executed correctly attestation reports that may be unsuitable for constrained
and without the presence of an adversary. Given a devices. Many schemes, therefore, make use of hash-based
measurement result from , then checks whether measurements to represent control-flow paths in a space-
these (partial) program measurements conform to efficient manner.
known valid results.
7.1.1. Hash Computation
A CFA scheme may measure the whole program and Let us consider the above example once more. One
partial elements, or multiple aspects of partial elements, solution to reduce the memory requirements on is to
simultaneously. BLAST [14] serves as a good example, make a single hash calculation over the collected control-
which uses optimal instrumentation of loops and functions flow transfers and return this (hashed) result to after the
in order to achieve whole program control-flow attestation CFG’s terminating node had been reached. This is method
efficiently. is, again, not widely due to the need to store every control-
We can also consider measuring techniques as acting flow transfer before the hash function has been applied.
at different levels of abstraction. Path-based techniques To address this problem, hash-based CFA proposals have
measure an entire or partial control-flow path within the opted for a cumulative approach to hash computation. The
target program’s CFG, i.e. path-level granularity. In con- measurement module performs a hash computation at every
trast, control-flow transfer-based methods monitor individ- control-flow transfer using data from the previous transfer.
ual low-level transitions at each branch, function call, or an- This can be described by the recurrence relation in Eq. 2:
other relevant control-flow event. There is little information {
stored explicitly about the path. ℎ(𝑑, 𝐶𝑖 ) 𝑖=0
𝐻𝑖 = (2)
At first glance, it may seem that path-based measure- ℎ(𝐻𝑖−1 , 𝐶𝑖 ) 𝑖 > 1
ments are superior because they consider the entire path
explicitly. However, there can be a significant measurement Where 𝑖 denotes the node number, 𝐻𝑖 refers to the hash
range of potentially legitimate paths for non-trivial pro- value generated at node 𝑖, and ℎ(⋅) is a suitable hash function.
grams. Recall that, prior to ’s deployment, usually must 𝐶𝑖 is the control-flow event at 𝑖, which may be represented by
a (src, dst) pair [12], an (entry, exit) pair [13], or a BBL ID as new elements, i.e. control-flow transfers, are added. In-
[4]. 𝑑 represents the default value used when the first control- cremental MSH schemes are used to efficiently update new
flow event when 𝑖 = 0, e.g. 0 in C-FLAT [4] or ℎ(0) in control-flow transfers, avoiding the storage or recalculation
GACFA [9]. The main advantage of cumulative hash-based of the entire set upon adding a new element [87]. Compared
approach is that it does not require to store all control- to conventional cryptographic hash functions, MSH func-
flow transfers. Once a round of calculation is completed, tions are space-efficient for CFA applications, reducing the
the corresponding control-flow transfers can be discarded. In number of possible hash values to be stored in a database
Figure 16, we present a canonical example of a cumulative belonging to or .
hash-based approach used by C-FLAT [4], demonstrating The space-efficient property of MSH functions can also
the hash values at different nodes of a target program’s CFG. pose some downsides. Namely, they do not consider the
order of control-flow transfers in the set. It is possible that
multiple legitimate control-flow paths have the same hash
value, thus loses contextual information control-flow path.
Even if a hash value in the measurement result matches one
in the database, can determine only the number, not the
order, of executed control-flow transfers. This could be fatal
if an adversary finds a sequence of malicious control-flow
transfers that produces the same MSH value as the legitimate
sequence. We pose this issue as a research challenge.
7.1.2. Discussion
The main advantage of hash-based measurement is that
it can generate short, constant-length measurement results,
irrespective of the target program’s size or the number of
Figure 16: An example of cumulative hash calculation approach control-flow events. The length of the measurement result
from C-FLAT [4]
is a major factor to influence the attestation report size, and
thus the space and network complexity when the report is
Common hash functions used in existing CFA proposals computed on and sent to .
can be categorised as standard cryptographic hash functions, However, hash-based measurements present some major
and multi-set hash functions. disadvantages. The first is the aforementioned path explo-
Standard cryptographic hash functions: BLAKE2 [85] sion problem: if a target program contains several loops
is a common choice for CFA schemes targeted at embedded with non-trivial terminating conditions, then it is possible
systems with limited computational resources [6, 13, 35, 43]. that relatively short programs can impose significant storage
BLAKE2 is known to be significantly outperform (∼50%) requirements on . Furthermore, if an attack does take
competing functions, such as Keccak and SHA-2, in raw place, then cannot determine how or where any violations
execution time (cycles per byte) [86]. We note that SHA- occurred in the control-flow path followed by the target
256 [8, 9], SHA-512 [41] and SHA-3 [12, 34] have also program from the hash measurement alone [7]. The presence
been employed in existing work, albeit to a lesser extent. of adversarial behaviour can be detected, but its precise
Worryingly, cryptographically broken algorithms have also nature cannot, which may pose difficulties in identifying
been investigated by some schemes (MD5 [41, 48]). and patching any security vulnerabilities. This is due to
Incremental Multi-set hash (MSH) functions: Some CFA the inherent pre-image resistance and avalanching effects
schemes have explored specialised hash functions with of cryptographic hash functions: using just one additional
space-efficient properties. Incremental MSH functions [87] control-flow transfer in the hash computation yields a sig-
take as a collection, or multiset, of arbitrary elements and nificant difference in the resulting hash value.
their occurrences. An MSH function allows one to incremen- A segmented hashing approach has been used to reduce,
tally generate different permutations of the multiset while but not eliminate, the effects of path explosion. Here, the
enabling the testing of equivalence relations. For example, target program is divided into multiple segments, and a
consider the multiset M = {⟨𝐴, 3⟩, ⟨𝐵, 2⟩, ⟨𝐶, 8⟩, ⟨𝐷, 1⟩}, separate hash computation is performed for each segment.
where ⟨𝐴, 3⟩ denotes the element 𝐴 and 3 represents its Atrium [37] considers each loop in the target program as a
number of occurrences (or multiplicity). Consider an MSH separate, independent segment. Let us consider a toy target
value constructed from this multiset where 𝐷 is added to program containing five sequential loops, each of which
an existing ordered set, {𝐴, 𝐵, 𝐶}, i.e. 𝐻1 = {𝐴, 𝐵, 𝐶} ∪ may iterate up to 10 times. A separate hash computation is
𝐷 = {𝐴, 𝐵, 𝐶, 𝐷}. Consider another MSH value, 𝐻2 = performed for each loop. will generate six hash values to
{𝐷, 𝐵, 𝐴, 𝐶}. An MSH scheme enables one to verify the represent the entire control-flow path, one for each loop, plus
equivalence of 𝐻1 and 𝐻2 . one hash value to represent a control-flow path excluding all
MSH schemes are used by DIAT [45] and CHASE [35]. loops. Consequently, must only store 51 legitimate hash
They allow the hash value of the set to be efficiently updated values (10 for each of the five loops) and one legitimate for
the control-flow path excluding loops, signfiicantly reducing Another example is RAGE [42], which relies on a ma-
the storage requirements versus a conventional hash-based chine learning-based approach using unsupervised graph
approach (resulting in storing 100,000 control-flow paths). neural networks to avoid computing a complete CFG of the
Segmented hash computation has also been used to tackle target program. In the training phase, uses one legitimate
conditional branches and recursive functions [13]. control-flow path represented as a sequence of BBL ad-
dresses as a partial CFG (PCFG) with which feature extrac-
7.2. Other Path Log Representations tion is performed. The features are used to train a Variational
Rather than summarising the control-flow path using Graph Autoencoder (VGAE) model [89] that encodes CFGs
hash values, other proposals have used more sophisticated into a lower-dimensional space (graph embedding). The
methods for logging and representation. ISC-FLAT by Neto legitimate control-flow path is converted to an embedding
& De Oliveira [38] aims to address the problems caused by which serves as a reference. Ten other legitimate and illicit
system interrupts during the measuring process. Consider a control-flow paths are used to set an appropriate threshold
simplified example in which traverses the nodes 𝑁𝐴 → using the directed Hausdorff distance [90] to quantify the
𝑁𝐵 → 𝑁𝐶 to produce a hash value 𝐻(𝑁𝐴 , 𝑁𝐵 , 𝑁𝐶 ) as difference between two embeddings. During the attestation
part of a legitimate control-flow path run on . In realistic process, sends a measurement result, a sequence of BBL
scenarios, a software or hardware interrupt may sponta- addresses, executed by the target program to . inputs this
neously cause another sequence of nodes to be visited mid- data to the VGAE model to produce a new embedding.
measurement, i.e. 𝑁𝐴 → 𝑁𝐵 → (𝑁𝐸 → 𝑁𝐹 → 𝑁𝐺 ) → compares the embedding with the referenced embedding;
𝑁𝐶 , where (⋅) contains the nodes visited by an interrupt if their distance exceeds a predefined threshold, then it
handler. Hashing these nodes would produce an unintended indicates the detection of attacks. The authors tested their
hash value, undermining the reliability of the measurement proposal using synthetic ROP and DOP attacks, yielding
process.6 results of 0.9803 (ROP) and 0.9101 (DOP) F1-score while
ISC-FLAT addresses this problem by using a TEE-based maintaining a low False Positive Rate of 0.0319.
application to monitor the control-flow transfers of an in-
strumented target program and trapping all interrupt requests 7.3. Whitelist-based Monitoring
within the TEE. The TEE is used to protect the measuring In whitelist-based CFA schemes, generates an enu-
process from kernel-mode adversaries. When an interrupt is meration of all authorised control-flow transfers in an offline
trapped, the TEE application notes the points at which the phase. In the online phase, sends a sequence of target
interrupt service routine was activated, which is reflected control-flow transfers taken by the target program to ,
in the report sent to . The system stores path logs of the represented by (src, dst) pairs. matches each (src, dst) pair
form {𝐴𝑖0 , 𝐴𝑒0 , 𝐴𝑖1 , 𝐴𝑒1 , … , 𝐴𝑖𝑛 , 𝐴𝑒𝑛 }, where 𝐴𝑖𝑥 represents the against the whitelist; any mismatch indicates the occurrence
memory address of the first instruction of node 𝑁𝑥 in the of a control-flow attack. The approach has been used by a
program’s control flow graph, and 𝐴𝑒𝑥 is the destination of a minority of proposals [7, 58].
branching instruction in 𝑁𝑥 . In other words, a sequence of An advantage of whitelist-based measurement is that
addresses and destinations is used to represent the control has greater visibility over the nature of the control-flow
flow, rather than aggregating them as a single hash value. violation, which can help with patching and remediation
Control-flow abuse is then detected through the inspection efforts. Another advantage is that it demands less storage
of the path log that is sent from to . space on than hash-based approaches. still requires a
BLAST [14] uses Ball-Larus numbering to assign a nu- whitelist to store all legitimate control-flow transfers of the
meric value to each edge of the target program’s CFG. Every target programs, but the path explosion problem is avoided;
possible path is marked by a unique integrity identifier,called the storage requirement is proportional to the number of
pathID, which is the sum of the values of the edges included legitimate control-flow transfers.
in the path. When a function call has finished execution, its A concern with whitelist approaches is the need to mon-
pathID, target address of the function call, and return in- itor, and potentially store, control-flow transfers in real-time.
struction are recorded and stored in shared memory between This can impose significant time and space requirements on
the REE and TEE. When a predetermined length has been . Indeed, if the attestation report is sent to only after the
reached, the log is sent to measurement module within the target program terminates, the measurement module must
to perform a cumulative hash computation. BLAST can send store the sequence in memory until that point. Indeed, a
two different measurement results. By default, sends a long sequence of control-flow transfers produces a large
single hash value as a measurement result to . If cannot attestation report size.
match the value in its database, it requests that provides One way to mitigate the problem caused long control-
a sequence of tuples <func, pathID>. then computes the flow transfer sequences is using partial attestation reports.
Whole Program Path (WPP) representation [88] from the Here, generates a mini-attestation report when specific
sequence and sends it to . requirements are meet—e.g. the configured length of the
6 A naïve solution is to suspend interrupts altogether during the mea-
sequence is reached [34], and particular piece of code fin-
surement process, but this can give rise to wider system issues. ishes [6]—in order to prevent memory exhaustion. Using
partial attestation reports allows to receive information
typically execute a large number of return instructions, • - paradigm: We give the prover-verifier (-)
i.e. backward control-flow transfers. Hashing is naturally paradigm used by the proposal; for example, whether
space-efficient over listing them sequentially, significantly it uses a conventional remote attestation, hybrid, or
reducing the size of the measurement result. Indeed, OAT another approach described in §4.
reported an average size reduction of 97% [6].
A second advantage is that it ameliorates a major draw- • Target control-flow transfers: The types of control-
back of hash-based measurements: the requirement for to flow transfers that are monitored by the proposal using
store all possible control-flow paths and their corresponding those discussed in §2.2.
hash values. Return addresses, which serve as input of hash • Target control-flow attacks: The types of control-
computation, are obtained by abstract execution based on flow attacks addressed by the scheme, which were
memory addresses of function call. As such, does not have discussed in §2.1. We further annotate this column as
to know learn possible backward control-flow paths during follows: a ‘MP’ label is used where the proposal as-
the offline phase, but instead only the final value. sumes a memory protection mechanism to be in place,
However, hybrid methods still inherit a significant disad- such as data execution prevention or non-writable
vantage of hash-based approaches. Specifically, if the hash code regions. ‡ indicates that the proposal monitors
value from the measurement result does not match the one only specific sections of the target program, not the
obtained from abstract execution, then does not know program in its entirety. ± is used when the proposal
what backward control-flow path was violated. This limits uses a machine learning or statistical model to detect
’s ability to precisely identify potential vulnerabilities that control-flow attacks; that is, whether an attack is de-
exploit return addresses (e.g. as used by ROP attacks). tected is probabilistic, not a certainty.
Table 2
A summary of CFA proposals using the comparison criteria from §8.1.
Proposal Year Trust Anchor Measurement Instrumentation FI? - Target Control- Control-Flow
Approach Method Paradigm Flow Transfers Attacks
C-FLAT [4] 2016 TEE (ARM Trust- Hash-based Binary Rewriting ✓ Conventional (C.1) (A.1)(A.2)
Zone) (A.3)¶
LO-FAT [12] 2017 CPU Extensions Hybrid ▴ ✓ Conventional (C.1) (A.1)(MP)
(A.2)(A.3)¶
ATRIUM [37] 2017 CPU Extensions Hash-based ▴ ✓ Conventional (C.1) (A.1)(A.2)†
(A.3)¶
LiteHAX [34] 2018 CPU Extensions Abstract Ex- ▴ ✓ Continuous (C.1) (A.1)(A.2)†
ecution Reporting (A.3)(A.4)
CHASE [35] 2019 CPU Extensions Hash-based ▴ ✓ Conventional (C.8)† (A.1)(A.2)
(MSH) (A.3)¶
DIAT [45] 2019 TEE* Hash-based Binary Rewriting † ✗ Collective (C.8) (A.1)‡(A.2)‡
(MSH) (A.3)‡¶
(A.4)‡¶
G&R [47] 2019 Hardware Module ▴ ▴ ✓ Conventional (C.2)(C.3)(C.4) (A.1)(MP)
(C.7) (A.2)(A.3)(A.4)
Liu et al. [7] 2019 TEE (ARM Trust- Whitelist Binary Rewriting ✓ Conventional (C.2)(C.3)(C.5) (A.1)(A.2)
Zone) + PUF (C.6)(C.7) (A.3)¶
MGC-FA [11] 2019 TEE (ARM Trust- Hash-based Compiler (GCC ✓ Conventional (C.8) (A.1)±(A.2)±†
Zone) and Capstone) (A.3)±¶
RADIS [8] 2019 TEE* Hash-based Dynamic† ✗ Collective (C.1) (A.1)(A.2)†
ScaRR [13] 2019 Kernel Hash-based Compiler (LLVM) ✓ Conventional (C.2)(C.3)(C.5) (A.1)(A.2)
(C.6)(C.7)
CFPA [60] 2019 TEE* ★ ★ ✗ Conventional ★ (A.1)(A.2)
(A.3)¶
DO-RA [46] 2020 TEE (ARM Trust- Hash-based Binary Rewriting ✓ Conventional (C.2)(C.5)(C.6) (A.1)(MP)
Zone) (C.7) (A.2)† (A.3)
LAPE [48] 2020 MPU Hash-based Compiler (LLVM) ✓ Conventional (C.3)(C.6) (A.1)(A.2)
OAT [6] 2020 TEE (ARM Trust- Hybrid Compiler (LLVM) ✓ Hybrid (C.2)(C.5)(C.6) (A.1)‡(MP)
Zone) (C.7) (A.2)‡† (A.3)‡
(A.4)
ARCADIS [41] 2021 ROM Hash-based Binary Rewriting † ✗ Collective (C.1) (A.1)(A.2)
GACFA [9] 2021 TEE (Intel SGX) Hash-based Dynamic (Intel ✓ Conventional (C.8) (A.1)±(A.2)±
Pin) (A.3)±¶
ReCFA [58] 2021 Kernel Whitelist Binary Rewriting ✓ Conventional (C.2)(C.5)(C.6) (A.1)(MP)
(C.7)(C.3)¶ (A.2)†
Tiny-CFA [62] 2021 Hardware Module ▴ Binary Rewriting ✓ Conventional (C.1) (A.1)(A.2)¶
(A.3)
Papamartzivanos 2021 Trace Module (In- ▴ ▴ ✓ Conventional (C.2)†(C.4)(C.7)† (A.1)(A.2)
et al. [68] tel PT)
BDFCFA [40] 2022 TEE (Intel SGX) Hash-based Dynamic (Intel ✓ Conventional (C.1) (A.1)(MP)
Pin) (A.2)¶ (A.3)¶
Ben Yehuda et 2022 Kernel Hash-based Binary Rewriting ✓ Conventional (C.1) (A.1)(A.2)
al. [39] (A.3)¶
CFRV [49] 2022 TEE (ARM Trust- Hash-based Compiler (LLVM) ✓ Collective (C.1) (A.1)(A.2)¶
Zone)
GuaranTEE [10] 2022 TEE (Intel SGX) Abstract Ex- Compiler (LLVM) ✓ Hybrid (C.4)(C.5)(C.6) (A.1)(A.2)¶
ecution (C.7)
ACFA [44] 2023 Hardware Module Abstract Ex- ▴ ✓ Continuous (C.1) (A.1)(A.2)
(On-chip) ecution Reporting
ARI [43] 2023 TEE (ARM Trust- Abstract Ex- Compiler (LLVM) ✓ Conventional (C.8) (A.1)(A.2)(A.3)
Zone) ecution
BLAST [14] 2023 TEE (ARM Trust- Path Log Compiler (LLVM) ✓ Conventional (C.2)(C.3)(C.5) (A.1)(MP)
Zone) (C.6)(C.7) (A.2)¶†
ISC-FLAT [38] 2023 TEE (ARM Trust- Path Log Binary Rewriting ✓ Conventional (C.1) (A.1)¶ (A.2)¶
Zone)
ZEKRA [36] 2023 Trace Module* Hash-based ▴* ✗ VC (C.1) (A.1)(A.2)
RAGE [42] 2024 TEE* Path Log Dynamic ✗ Conventional (C.1) (A.2)±¶
(A.4)±¶
LightFAt [5] 2024 PMU Path Log Dynamic ✓ Conventional (C.1) (A.1)±(A.2)±
*: Recommended for a complete implementation but not evaluated. ▴: Hardware detection circuit. †: Suspected based on publicly available
information. ¶: Partially addresses. ★: This work sketches a property-based CFA scheme and high-level requirements; it is neither implemented
nor evaluated. ‡: Uses a probabilistic approach. ±: Applies to only portions of the target program.
modify program code at run-time (marked as rx)” (p. 3, [12]) performing the challenge at launch-time, for example, would
as an implicit requirement for a memory protection mecha- be insufficient in detecting transient attackers.
nism. Geden & Rasmussen [47] requires non-writable code A final point of note is the relative lack of focus on direct
regions and non-executable data regions. In a TEE-based data manipulation (DDM) and data-oriented programming
scheme, OAT [6] assumes that the attackers cannot tamper (DOP) attacks: (A.3) and (A.4) respectively. 16/31 (≈52%)
with the instrumented code or the trampoline library in the and only 5/31 (≈16%) proposals address (A.3) and (A.4),
normal world, citing some memory protection methods for again respectively. We have mentioned that DOP attacks
embedded devices as potential solutions [93, 94].7 represent a highly sophisticated attacker (see §2.1 and [34]).
We observe that ROP attacks are the primary threat in The principal focus of CFA schemes is on direct control flow
schemes that address code reuse attacks, (A.2). Such pro- attacks, (A.1) and (A.2), rather than data flow attacks, even
posals add detection mechanisms to control-flow transfers though the latter can influence the former indirectly. Indeed,
that return from functions. Some work, e.g. C-FLAT [4] some schemes explicitly declare data-flow attacks to be out-
and GuaranTEE [10], address a particular variant in which a of-scope, even though their potential effects on control-flow
function is invoked from multiple calling locations. Suppose are recognised [4, 60]. Preventing data-oriented attacks is
there are two legitimate control-flow paths, 𝐴 → 𝐵 → 𝐶 a burgeoning area of research [96], and it is our belief that
and 𝐷 → 𝐵 → 𝐸, where 𝐵 is a vulnerable function. An designing attestation schemes that detect control- and data-
attacker may perform an attack to alter the control-flow path flow attacks is an important direction going forwards.
to 𝐷 → 𝐵 → 𝐶, which still visits legitimate CFG nodes
and may not be detected as an illicit path. This has been
addressed through the use of shadow stacks [7, 13, 47, 58]. 9. Open Problems and Recommendations
When a call-based control-flow transfer is executed, if its This section discusses research gaps and makes sugges-
target address is valid, then its call-after point will be pushed tions and recommendations for future work in the field.
on this shadow stack. For a return-based transfer, if its target
address is valid based on CFG, then checks if it is identical 9.1. Platform Dependencies
to the top element of the shadow stack. If it is not, a control- We observed that many CFA schemes rely on proprietary
flow hijacking attack would be detected. If it matches, the platform features, such as Intel PT for measuring control-
top element is popped from the stack. Call-return matching flow traces, and Arm TrustZone for protecting code integrity.
is another solution in which marks legitimate pairs of call This can negatively affect the potential longevity of propos-
and return control-flow transfers within the scheme’s offline als if certain features are made obsolete by platform vendors.
phase, which are measured at run-time [4]. We note, for example, that Intel SGX was recently discon-
All CFA proposals claimed that code reuse attacks can be tinued for consumer devices (Intel Core CPUs), limiting the
detected to some extent; however, only 11/31 (≈35%) specif- applicability of relevant CFA schemes that depend on it. It is
ically mentioned call-return matching or shadow stacks as possible that analogous technologies may be present on other
well-known countermeasures [4, 6, 7, 13, 34, 35, 36, 38, platforms, e.g. Arm TrustZone over Intel SGX, and that the
43, 47, 58]. In some instances, details about the call-return main idea of the scheme can still be applied. Nevertheless,
matching implementation have not been discussed fully [34]. not every TEE offers the same security properties. We urge
The propensity to focus on ROP attacks is notable. In recent caution when translating schemes to other platforms and the
years, other code reuse attacks have been developed, such impact this has on reducing the CFA scheme’s security.
as COP and JOP attacks, while only ROP attacks have
been demonstrated. In future work, we urge that researchers ⋄ Recommendation: We recommend that authors re-
demonstrate a wider range of code reuse attacks, such as lying on platform-specific security features for their
COP, JOP, return-to-libc, and ROP attacks. CFA schemes offer alternatives for those features on
In a related challenge, many proposals attest the program other platforms. Alternatively, authors ought to con-
at launch-time, which can give rise to TOCTOU attacks if sider generic constructions that are agnostic to partic-
an adversary conducts an attack before or after an attes- ular platform features.
tation challenge. Continuous reporting proposals explicitly
aim to address this challenge [34, 44] (see §4.2), where 9.2. Limited Deployment Scenarios
actively actively sends measurements to . However, this During the course of this work, it became evident that
paradigm is used by a very small number of proposals (2/31, CFA proposals tend to evaluated in laboratory conditions or
≈6%). Conventional attestation schemes may still detect this using fixed datasets, e.g. SPEC CPU benchmarks, that do
by storing measurements until they are requested by not replicate many real-world scenarios. In particular, we
(see ARCADIS [41]). Yet, this relies on determining recognised that CFA schemes do not consider long-running
an appropriate timeframe to issue the challenge. Simply applications, like those that control sensors and actuators.
Such programs may execute a large number of control-flow
7 We note that, in some cases, it is possible to bypass memory protection
transfers during run-time. For schemes that measure control-
mechanisms, e.g. DEP, using ROP-style attacks. To this end, we refer the flow transfers, might receive large attestation reports; in a
reader to work by Göktas et al. [95].
path-based scheme, may find it challenging, if not infea-
sible, to obtain all legitimate control-flow paths due to the
enormous state space. We observed that very few proposals and adversarial models. We note that evaluations tend to
(e.g. OAT [6]) support customisable code attestation, which be conducted in an ad hoc manner, employing different
could alleviate this problem. test programs, metrics, and baselines. Consequently, it is
A large number of proposals rely on the submission difficult to evaluate CFA schemes in order to compare their
of program inputs in order to trigger a measurement of relative effectiveness. Some proposals have evalauted com-
traversed CFG notes. This paradigm may cause issues in ap- mon programs. In particular, the open syringe pump is the
plications where the program inputs do not traverse affected most common, which is an embedded program for remotely
nodes, e.g. affected by a code injection attack. For example, controlling liquid injections in medical applications. This
consider a device that accepts a significant number of inputs, has been used by seven CFA proposals [4, 6, 12, 34, 38,
such as a robotic arm or another cyber-physical system () 43, 62] that are designed for embedded devices. Using more
that communicates with a remote control service (). What general benchmarks is another choice in evaluation. SNU
parameters should submit to ? Is it possible that the sup- Real-Time Benchmark is the most common one evaluated
plied parameters could unintentionally miss malicious code by three proposals [9, 11, 40]. Some proposals [14, 36, 42]
within a larger complex program? These questions, among published recently opt for Embench-IOT benchmark suite
others, have received limited attention in the literature. [97]. However, they are not designed for CFA schemes.
It is also conceivable that relies on sensitive data that Besides performance evaluations, the lack of uniform
cannot reasonably access, such as personally identifiable test vectors for evaluating security is also a notable omission
information or security-sensitive sensor data. Unfortunately, in the literature. CRFV [49] uses the RIPE [98] benchmark
we found that authors rarely discuss application- or domain- to evaluate the capability of code injection attacks and ROP
specific implications, even though it may impact the deploy- attack detection. Some proposals (e.g., C-FLAT [4]) involve
ability of a CFA scheme in practice. manual attacks that redirect the target program’s control flow
in the evaluation. Chilese et al. [42] evaluate their proposals
⋄ Recommendation: We advise that authors describe through a synthetic attack trace generation approach. They
the deployment assumptions and some challenges that develop generators that modify legitimate control-flow paths
may arise therein under realistic scenarios. In con- to compromised ones under ROP and DOP attacks to simu-
junction with the recommendation in §9.4, it may be late the two attacks.
useful to develop evaluation approaches and datasets
that reflect real-world problems in common domains ★ Research Question: To what extent can a benchmark-
of interest, e.g. cyber-physical systems. ing suite be developed for CFA schemes that evaluates
different control-flow attacks in a uniform manner?
9.3. Closed-source Implementations
We also became aware that most CFA proposals are pre- 9.5. Addressing Physical Attacks
sented without any publicly available artefacts. This is par- Many papers deem physical attacks on the devices as
ticularly crucial when a scheme demands extensive low-level outside the scope of their threat models [8, 47]. This is
changes, such as custom CPU extensions as FPGA/HDL acceptable where we can safely assume that adversaries
modules, binary-level modifications, custom kernel mod- are strictly remote or software-based ones. However, many
ules, and new assembly routines. A severe cost can be CFA schemes are designed for embedded systems, which
imposed to students and researchers who may wish to rapidly may be deployed in largely unsupervised environments,
prototype or compare multiple schemes. We also observe such as industrial control systems and critical national in-
that key implementation aspects are sometimes left unde- frastructure. The value of such devices to adversaries may
fined, such as the method of instrumentation, and some bring increase the viability of physical attacks, such as fault
schemes are not fully implemented (6/31, ≈20%). injection and side-channel attacks. Indeed, the authors of
Ultimately, this may lead to unfaithful implementations OAT state: “We trust code running inside the Secure World
and evaluations, impeding the reproducibility of some pro- (e.g., the measurement engine) and assume that attackers
posals. Additionally, a few proposals with closed-source cannot break the TrustZone protection. We also trust our
implementations do not describe the types of target control- compiler and the trampoline code. We assume that attackers
flow transfers or how performs CFI checks, which in- cannot inject code in the Normal World or tamper with the
creases the difficulty for us to identify which specific types instrumented code or the trampoline library” [6]. It is widely
of control-flow attacks they can detect. known that TEEs are susceptible to hardware attacks [50].
We pose this as a significant gap in the literature, and urge
⋄ Recommendation: We strongly recommend that au- the development of CFA schemes that consider such threats.
thors open-source CFA schemes in the pursuit of
open science and reproducibility. Schemes that rely on ★ Research Question: To what extent can existing CFA
hardware extensions and low-level software modifica- schemes be made explicitly resilient to physical hard-
tions are of particular importance. ware attacks, such as fault injections?
9.4. On the Need for Common Benchmarks 9.6. Extended Attestation Paradigms
A wide variety of CFA schemes have been proposed in The vast majority of CFA schemes follow a traditional
the scientific literature with a myriad of system architectures remote attestation with a single has a rich, longstanding
history dating back to the inception of the Trusted Plat- [10] M. Morbitzer, B. Kopf, and P. Zieris, “GuaranTEE: Introducing
form Modules (TPM) in the early-2000s. We recognise control-flow attestation for trusted execution environments,” in 16th
that some different approaches to attestation, which were International Conference on Cloud Computing, IEEE, 2023.
[11] J. Hu, D. Huo, M. Wang, Y. Wang, Y. Zhang, and Y. Li, “A
developed for TPM-based platforms, have not yet been ap- probability prediction based mutable control-flow attestation scheme
plied to the domain of control-flow attestation. In particular, on embedded platforms,” in 18th IEEE Int’l Conference on Trust,
property-based [61], direct anonymous [99], and mutual Security and Privacy in Computing and Communications, IEEE,
attestation [100] have not been fully adapted to CFA. We 2019.
believe that these areas are fruitful avenues for researchers [12] G. Dessouky, S. Zeitouni, T. Nyman, A. Paverd, L. Davi, P. Koeberl,
N. Asokan, and A.-R. Sadeghi, “LO-FAT: Low-overhead control
in the field. flow attestation in hardware,” in 54th Annual Design Automation
Conference, 2017.
★ Research Question: To what extent can other at- [13] F. Toffalini, E. Losiouk, A. Biondo, J. Zhou, and M. Conti, “ScaRR:
testation paradigms, such as PBA, DAA and mutual Scalable runtime remote attestation for complex systems,” in 22nd
attestation, be integrated into control-flow attestation? Int’l Symposium on Research in Attacks, Intrusions and Defenses,
2019.
[14] N. Yadav and V. Ganapathy, “Whole-program control-flow path
10. Conclusion attestation,” in ACM SIGSAC Conference on Computer and Com-
munications Security, 2023.
In this paper, we presented a comprehensive analysis of [15] R. V. Steiner and E. Lupu, “Attestation in wireless sensor networks:
control-flow attestation as it is currently realised in the state A survey,” ACM Computing Surveys, vol. 49, no. 3, pp. 1–31, 2016.
of the art, exploring the concepts, solutions, techniques and [16] O. Arias, F. Rahman, M. Tehranipoor, and Y. Jin, “Device attes-
tation: Past, present, and future,” in Design, Automation & Test in
assumptions of 31 proposals in the field. We categorised Europe Conference & Exhibition (DATE), pp. 473–478, IEEE, 2018.
existing CFA proposals based on their key features using [17] I. Sfyrakis and T. Gross, “A survey on hardware approaches
a common set of criteria, including their instrumentation for remote attestation in network infrastructures,” arXiv preprint
method, target control-flow transfers, attacks addressed, at- arXiv:2005.12453, 2020.
testation paradigm, and more. Based on this, we presented a [18] W. A. Johnson, S. Ghafoor, and S. Prowell, “A taxonomy and review
of remote attestation schemes in embedded systems,” IEEE Access,
series of open challenges and recommendations for steering vol. 9, pp. 142390–142410, 2021.
future research in the field. Through our work, we hope it can [19] S. F. J. J. Ankergård, E. Dushku, and N. Dragoni, “State-of-the-art
enhance the reader’s understanding of the capabilities and software-based remote attestation: Opportunities and open issues for
considerations involved in designing and analysing control- Internet of Things,” Sensors, vol. 21, no. 5, p. 1598, 2021.
flow attestation schemes, and the challenges that lie ahead. [20] B. Kuang, A. Fu, W. Susilo, S. Yu, and Y. Gao, “A survey of
remote attestation in internet of things: Attacks, countermeasures,
and prospects,” Computers & Security, vol. 112, p. 102498, 2022.
[21] M. Ambrosin, M. Conti, R. Lazzeretti, M. M. Rabbani, and
References S. Ranise, “Collective remote attestation at the internet of things
[1] M. Abadi, M. Budiu, U. Erlingsson, and J. Ligatti, “Control-flow in- scale: State-of-the-art and future challenges,” IEEE Communica-
tegrity principles, implementations, and applications,” ACM Trans- tions Surveys & Tutorials, vol. 22, no. 4, pp. 2447–2461, 2020.
actions on Information and System Security, vol. 13, 2009. [22] N. Burow, S. A. Carr, J. Nash, P. Larsen, M. Franz, S. Brunthaler,
[2] V. Kuznetzov, L. Szekeres, M. Payer, G. Candea, R. Sekar, and and M. Payer, “Control-flow integrity: Precision, security, and per-
D. Song, “Code-pointer integrity,” in The Continuing Arms Race: formance,” ACM Computing Surveys, vol. 50, no. 1, 2017.
Code-Reuse Attacks and Defenses, pp. 81–116, ACM, 2018. [23] R. De Clercq and I. Verbauwhede, “A survey of hardware-based
[3] T. Abera, N. Asokan, L. Davi, F. Koushanfar, A. Paverd, A.-R. control flow integrity (CFI),” arXiv preprint arXiv:1706.07257,
Sadeghi, and G. Tsudik, “Things, trouble, trust: on building trust in 2017.
iot systems,” in 53rd Annual Design Automation Conference, 2016. [24] P. Muntean, M. Neumayer, Z. Lin, G. Tan, J. Grossklags, and
[4] T. Abera, N. Asokan, L. Davi, J.-E. Ekberg, T. Nyman, A. Paverd, C. Eckert, “Analyzing control flow integrity with LLVM-CFI,” in
A.-R. Sadeghi, and G. Tsudik, “C-FLAT: Control-flow attestation 35th Annual Computer Security Applications Conference, 2019.
for embedded systems software,” in ACM SIGSAC Conference on [25] S. Chen, J. Xu, E. C. Sezer, P. Gauriar, and R. K. Iyer, “Non-control-
Computer and Communications Security, 2016. data attacks are realistic threats,” in USENIX Security Symposium,
[5] J. Gonzalez-Gomez, H. Nassar, L. Bauer, and J. Henkel, “Light- vol. 5, p. 146, 2005.
FAt: Mitigating control-flow explosion via lightweight PMU-based [26] H. Shacham, “The geometry of innocent flesh on the bone: Return-
control-flow attestation,” in IEEE International Symposium on into-libc without function calls (on the x86),” in 14th ACM Confer-
Hardware Oriented Security and Trust, pp. 222–226, IEEE, 2024. ence on Computer and Communications Cecurity, 2007.
[6] Z. Sun, B. Feng, L. Lu, and S. Jha, “OAT: Attesting operation [27] A. Homescu, M. Stewart, P. Larsen, S. Brunthaler, and M. Franz,
integrity of embedded devices,” in IEEE Symposium on Security and “Microgadgets: Size does matter in Turing-complete return-oriented
Privacy, pp. 1433–1449, IEEE, 2020. programming,” Workshop on Offensive Technologies, 2012.
[7] J. Liu, Q. Yu, W. Liu, S. Zhao, D. Feng, and W. Luo, “Log-based [28] T. Bletsch, X. Jiang, V. W. Freeh, and Z. Liang, “Jump-oriented pro-
control flow attestation for embedded devices,” in 11th Interna- gramming: a new class of code-reuse attack,” in 6th ACM Symposium
tional Symposium on Cyberspace Safety and Security, pp. 117–132, on Information, Computer and Communications Security, 2011.
Springer, 2019. [29] N. Carlini and D. Wagner, “ROP is still dangerous: Breaking modern
[8] M. Conti, E. Dushku, and L. V. Mancini, “RADIS: Remote attesta- defenses,” in 23rd USENIX Security Symposium, pp. 385–399, 2014.
tion of distributed IoT services,” in 6th International Conference on [30] A. Sadeghi, S. Niksefat, and M. Rostamipour, “Pure-call oriented
Software Defined Systems, pp. 25–32, IEEE, 2019. programming (pcop): chaining the gadgets using call instructions,”
[9] J. Zhan, Y. Li, Y. Liu, H. Li, S. Zhang, and L. Lin, “NSGA-II- Journal of Computer Virology and Hacking Techniques, 2018.
based granularity-adaptive control-flow attestation,” Security and [31] S. Chen, J. Xu, E. C. Sezer, P. Gauriar, and R. K. Iyer, “Non-control-
Communication Networks, vol. 2021, pp. 1–16, 2021. data attacks are realistic threats,” in USENIX Security Symposium,
[72] C. Maurice, N. Le Scouarnec, C. Neumann, O. Heen, and A. Francil- [94] C. H. Kim, T. Kim, H. Choi, Z. Gu, B. Lee, X. Zhang, and D. Xu,
lon, “Reverse engineering Intel last-level cache complex addressing “Securing real-time microcontroller systems through customized
using performance counters,” in 18th Int’l Symposium on Research memory view switching.,” in Network and Distributed System Se-
in Attacks, Intrusions, and Defenses, Springer, 2015. curity Symposium, 2018.
[73] B. Zhou, A. Gupta, R. Jahanshahi, M. Egele, and A. Joshi, “Hard- [95] E. Göktas, E. Athanasopoulos, H. Bos, and G. Portokalidis, “Out of
ware performance counters can detect malware: Myth or fact?,” in control: Overcoming control-flow integrity,” in IEEE Symposium on
ACM Asia Conference on Computer and Communications Security, Security and Privacy, pp. 575–589, IEEE, 2014.
2018. [96] L. Cheng, S. Ahmed, H. Liljestrand, T. Nyman, H. Cai, T. Jaeger,
[74] I. D. O. Nunes, K. Eldefrawy, N. Rattanavipanon, and G. Tsudik, N. Asokan, and D. Yao, “Exploitation techniques for data-oriented
“APEX: A verified architecture for proofs of execution on remote attacks with existing and potential defense approaches,” ACM Trans-
devices under full software compromise,” in 29th USENIX Security actions on Privacy and Security, vol. 24, no. 4, pp. 1–36, 2021.
Symposium, pp. 771–788, 2020. [97] Embench™: A Modern Embedded Benchmark Suite. https://www.
[75] C. Lattner and V. Adve, “Llvm: A compilation framework for life- embench.org/.
long program analysis and transformation,” in International Sympo- [98] J. Wilander, N. Nikiforakis, Y. Younan, M. Kamkar, and W. Joosen,
sium on Code Generation and Optimization, pp. 75–86, IEEE, 2004. “Ripe: Runtime intrusion prevention evaluator,” in 27th Annual
[76] G. Gange, J. A. Navas, P. Schachte, H. Søndergaard, and P. J. Computer Security Applications Conference, 2011.
Stuckey, “An abstract domain of uninterpreted functions,” in 17th [99] E. Brickell, J. Camenisch, and L. Chen, “Direct anonymous attesta-
Int’l Conference on Verification, Model Checking, and Abstract tion,” in 11th ACM Conference on Computer and Communications
Interpretation, Springer, 2016. Security, 2004.
[77] J. Zhao, S. Nagarakatte, M. M. Martin, and S. Zdancewic, “Formal- [100] C. Shepherd, R. N. Akram, and K. Markantonakis, “Remote cre-
izing the LLVM intermediate representation for verified program dential management with mutual attestation for trusted execution
transformations,” in 39th Annual ACM Symposium on Principles of environments,” in Information Security Theory and Practice: 12th
Programming Languages, 2012. IFIP WG 11.2 International Conference, Springer, 2019.
[78] T. Ball and J. R. Larus, “Efficient path profiling,” in 29th Annual
IEEE/ACM Int’l Symposium on Microarchitecture, IEEE, 1996.
[79] A. A. Clements, N. S. Almakhdhub, S. Bagchi, and M. Payer,
“ACES: Automatic compartments for embedded systems,” in 27th
USENIX Security Symposium, pp. 65–82, 2018.
[80] A. Q. Nguyen et al., “Capstone: The Ultimate Disassembler,” 2024.
https://www.capstone-engine.org.
[81] University of Wisconsin-Madison, “Dyninst,” 2024. http://www.
dyninst.org/.
[82] C.-K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney,
S. Wallace, V. J. Reddi, and K. Hazelwood, “Pin: building cus-
tomized program analysis tools with dynamic instrumentation,”
ACM SIGPLAN Notices, vol. 40, no. 6, pp. 190–200, 2005.
[83] D. Bruening and S. Amarasinghe, Efficient, transparent, and com-
prehensive runtime code manipulation. PhD thesis, Massachusetts
Institute of Technology, 2004.
[84] A. Engelke and M. Schulz, “Instrew: Leveraging LLVM for high
performance dynamic binary instrumentation,” in 16th ACM Int’l
Conference on Virtual Execution Environments, 2020.
[85] J.-P. Aumasson, S. Neves, Z. Wilcox-O’Hearn, and C. Winnerlein,
“BLAKE2: Simpler, smaller, fast as MD5,” in 11th Int’l Conference
on Applied Cryptography and Network Security, Springer, 2013.
[86] J.-P. Aumasson, S. Neves, Z. Wilcox-O’Hearn, and C. Winner-
lein, “BLAKE2: simpler, smaller, fast as MD5,” in 11th Interna-
tional Conference on Applied Cryptography and Network Security:,
pp. 119–135, Springer, 2013.
[87] D. Clarke, S. Devadas, M. Van Dijk, B. Gassend, and G. E. Suh, “In-
cremental multiset hash functions and their application to memory
integrity checking,” in 9th International Conference on the Theory
and Application of Cryptology and Information Security, pp. 188–
207, Springer, 2003.
[88] J. R. Larus, “Whole program paths,” ACM SIGPLAN Notices,
vol. 34, no. 5, pp. 259–269, 1999.
[89] T. N. Kipf and M. Welling, “Variational graph auto-encoders,” arXiv
preprint arXiv:1611.07308, 2016.
[90] F. Hausdorff, Grundzüge der mengenlehre, vol. 7. von Veit, 1914.
[91] Y. Collet et al., “RFC 8878: Zstandard compression and the ‘appli-
cation/zstd’ media type,” 2021.
[92] J. R. Larus, “Abstract execution: A technique for efficiently tracing
programs,” Software: Practice and Experience, 1990.
[93] A. A. Clements, N. S. Almakhdhub, K. S. Saab, P. Srivastava,
J. Koo, S. Bagchi, and M. Payer, “Protecting bare-metal embedded
systems with privilege overlays,” in IEEE Symposium on Security
and Privacy, pp. 289–303, IEEE, 2017.