Proceedings of the Third International Conference on Digital Security and Forensics (DigitalSec), Kuala Lumpur, Malaysia, 2016
Utilizing Program’s Execution Data for Digital Forensics
Ziad A. Al-Sharif
Software Engineering Department
Jordan University of Science and Technology
Irbid, 22110, P.O. Box 3030, Jordan
zasharif@just.edu.jo
ABSTRACT
Criminals use computers and software to commit
their crimes or to cover up their misconduct. Main
memory (RAM) holds a wealth of information
about a system, including its active processes.
The values of a program’s variables differ in their
scope and duration in RAM. This paper exploits
a program’s execution state and its dataflow to
obtain evidence of the software usage. It
extracts information left by program execution
in support of legal action against perpetrators.
Our investigation model assumes that no information is
provided by the operating system; only raw RAM
dumps are available. Our methodology employs information
from the source code of the target program. This
paper targets C programs that are used on Unix-based
systems. Several experiments are designed
to show that the scope and storage information of
various source code variables can be used to
identify a program’s activities. The results show that
investigators have a good chance of locating the values
of various variables even after the process has stopped.
KEYWORDS
Digital Forensics, Memory Forensics, Memory
Dumps, Carving Variable Values, String Variables,
C Programs.
1
INTRODUCTION
Criminals use computers and software to
commit their crimes or to cover up their
wrongdoing. Locating a program on the
machine’s hard disk might not be enough to
establish that the program was actually used.
Evidence might be needed to confirm that the
perpetrator actually used that program. This
evidence can be found in a couple of places,
one of which is the RAM of the machine used.
ISBN: 978-1-941968-37-6 ©2016 SDIWC
This emphasizes the significance of memory
forensics and its use in crime investigation.
Generally, programs vary in their dependence
on memory, CPU, disk I/O, and networks [1].
A program’s control flow might depend heavily
on various variables and their values, which are
stored in different main memory (RAM)
locations. These variables can be categorized
based on their scopes and execution lifetimes.
A scope determines the visibility of a variable
and where it can be accessed within the
program’s source code. In contrast, a variable’s
storage (memory type) determines its duration,
that is, the period between the creation and the
destruction of its value. Additionally, variables can be
classified based on whether they are allowed
to change during execution. Constant
variables are those that cannot be changed once
they are assigned; most of them are assigned
literal (hard-coded) values. These
literals might be unique to the executable
program and its execution state. Additionally,
many non-constant variables might be
initialized with hard-coded values (literals).
We assume that no information is available from
the operating system; only raw RAM dumps are available.
This paper locates evidence that can be
used to confirm the software usage and its
association with the crime. Our investigation
model is based on variables’ scopes and
memory types. In order to verify our
research methodology, various experiments
and scenarios are developed. RAM
dumps are created and analyzed to locate
the values of related variables (literal and non-literal)
based on the program source code and its
execution state.
This paper targets C programs that run under
Unix-based systems. However, most of
our findings are equally applicable to other
languages and operating systems. Our results
show that, regardless of whether the process is
active or has just stopped, the memory investigator
can employ knowledge about the program
source code and its variables, such as global
and local static variables and their potential values, to
confirm the program usage. In addition, the values of
local auto variables are successfully located
when their corresponding stack frames are
still active. On the other hand, dynamically
allocated values can be located as long as the
program has not stopped and the corresponding
memory has not been released.
The rest of this paper is organized as follows.
Section 2 highlights some of the background
knowledge used in this paper. Section 3
describes our investigation model and how it
employs information available in the program
source code to confirm that the program
is actually used. Section 4 presents our
four experiments and Section 5 discusses our
promising results. Section 6 presents some
of the related works. Finally, our planned
future work is presented in Section 7 whereas
Section 8 concludes our findings.
2
BACKGROUND
A software process may employ various
variables (memory storage). In the C language,
variables can be classified based on their scope
and duration into global, local auto, and local
static [2]. Global and static data are allocated
by the runtime system for a program at its
start. These variables are initialized with
default (zero) values whenever they are not explicitly
initialized by the programmer. The lifetime
(duration) of this kind of data is the same as that of the
process that uses the data [3, 4]. However,
unlike global variables, the visibility of local
static variables is limited to the scope of their
function or block.
Typically, operating systems provide
services to the programs they run. In a Unix-based
system, when the Kernel executes a C program,
a special routine (known as the startup routine)
is automatically invoked to set up the command
line arguments and the environment. Then,
the main() function is called.
Figure 1. A general view of the major logical segments
of the memory dedicated for a loaded process (running
program) under a Unix based system.
The Kernel manages software processes, each
of which is provided a dedicated memory space
in RAM [5, 6, 7]. When the executable starts,
various sections are allocated and loaded into
RAM; the starts and ends of these sections
are independent of the RAM page boundaries.
During execution, different variables are stored
in RAM into various logically classified
segments. Figure 1 shows a logical view of the
major memory segments that are dedicated to a
running process. A loaded process consists of
the following major segments:
.text: is a read-only segment that contains
the binary instructions (executable) located
below the heap and stack.
.rodata: is a read-only segment that contains
the immutable variables; read-only constants
and string literals.
.data: is a read-write segment that contains
global and local static mutable variables that
are explicitly initialized by programmers.
.bss: is a read-write segment that contains
global and local static mutable variables that
are not explicitly initialized by the programmer. Usually,
variables in this segment are initialized to zero by the
Kernel before the program starts.
Heap: is a memory segment allocated when
the process starts. It provides runtime memory
allocation for variables and their values as
needed during execution. A program’s data
that lives in the heap can be referenced outside
the scope of the function that allocated it.
In the C language,
heap memory allocations are managed by the
program, with help from the Kernel, through
library functions such as malloc(), calloc(), and
realloc(). The program can explicitly
release the allocated memory by calling
free() [3, 4].
Stack: is a memory segment allocated when
the program starts and it is automatically
managed by the Kernel and its runtime system.
It consists of blocks called activation records
or frames, each of which represents a call
to a function and provides storage for its
corresponding local variables and formal parameters.
The lifetime (duration) of variables allocated
on the stack is the same as that of the scope in which they
are declared (mostly the function and its stack
frame) [3, 4].
Figure 2 shows a sample C program
with variables of various scopes and their
correspondence to the logical view presented
in Figure 1. Hence, memory investigators can
utilize the in-memory values of various variables
to locate evidence of the actual use of the
software.
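The mapping between declarations and segments described above can be illustrated with a minimal sketch. This is not the program of Figure 2; all names are hypothetical, and the segment noted in each comment is the typical placement on a Unix-based system, which a particular compiler or optimization level may change.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

const char ro_msg[] = "read-only literal";    /* const global: typically .rodata */
char g_init[] = "initialized global";         /* initialized global: typically .data */
char g_zero[32];                              /* uninitialized global: typically .bss */

/* Counts its own calls; the counter outlives every call. */
int count_calls(void)
{
    static int calls;                          /* uninitialized local static: typically .bss */
    static char tag[] = "initialized static";  /* initialized local static: typically .data */
    char scratch[sizeof tag];                  /* local auto: stack frame of this call */

    memcpy(scratch, tag, sizeof tag);          /* scratch exists only while the frame is active */
    (void)scratch;
    return ++calls;
}

/* Returns a heap copy of s (NULL on failure); the caller releases it with free(). */
char *heap_copy(const char *s)
{
    assert(s != NULL);
    char *p = malloc(strlen(s) + 1);           /* heap: lives until free() is called */
    if (p != NULL)
        strcpy(p, s);
    return p;
}
```

Calling count_calls() twice returns 1 and then 2, because the static counter keeps its value between calls, while scratch is recreated in a new stack frame each time.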
Figure 2. Sample C program shows different variables
and their corresponding logical memory segments and
duration.
3
INVESTIGATION MODEL
A variable’s scope affects its visibility within the
program source code, and a variable’s storage
affects its duration. Hence, different scopes
and storage types might affect the survivability of
a variable’s value in memory during various
execution states. Accordingly, locating these
values in a RAM dump can be
used as evidence that a user
actually used the presumed program. Our
investigation model studies the possibility of
locating the values of variables that are used
within different scopes and storage types. Our
experiments study three different scopes:
global, local auto, and local static. The model also tries
to distinguish between various values within
different execution states, see Figure 3. These
execution states are based on various scenarios
such as:
• The variable is used or not-used yet
during program execution
• The variable was used in a currently active
or inactive stack frame
• The variable is never used; the variable is
never reached or the stack frame has never
been active
• The allocated variable’s data is never
released or just released
• The software process is live (still running)
or dead (just stopped)
Furthermore, our investigation model assumes
no information is provided by the operating
system, only memory dumps that are created
during various execution states and scenarios.
Then, each memory dump is searched for
potential values related to the source code of
the presumed program, see Figure 4.
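The search step can be sketched as a plain byte-pattern scan over the raw dump. The paper does not state which tool performed the search, so the following is only an assumption about how it could be done; the function names count_occurrences and count_in_dump are hypothetical, and the whole dump is loaded into memory for simplicity.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Counts (possibly overlapping) occurrences of needle in buf[0..n). */
size_t count_occurrences(const unsigned char *buf, size_t n, const char *needle)
{
    assert(needle != NULL && (buf != NULL || n == 0));
    size_t m = strlen(needle), hits = 0;

    if (m == 0 || m > n)
        return 0;
    for (size_t i = 0; i + m <= n; i++)
        if (memcmp(buf + i, needle, m) == 0)
            hits++;
    return hits;
}

/* Loads a raw dump file and counts needle in it; returns -1 on any I/O error. */
long count_in_dump(const char *path, const char *needle)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return -1; }
    long size = ftell(f);
    if (size < 0) { fclose(f); return -1; }
    rewind(f);

    unsigned char *buf = malloc(size > 0 ? (size_t)size : 1);
    if (buf == NULL) { fclose(f); return -1; }
    size_t got = fread(buf, 1, (size_t)size, f);
    fclose(f);

    long hits = (got == (size_t)size) ? (long)count_occurrences(buf, got, needle) : -1;
    free(buf);
    return hits;
}
```

A byte-wise scan is used instead of strstr() because a RAM dump contains embedded zero bytes, which would end a C string search prematurely.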
Figure 3. Various variables’ states that are explored
during our experiments. #1 represents a literal value
within an active frame and a live process. #2 represents
a literal value within a currently inactive frame and a live
process. #3 represents a literal value within a currently
inactive frame and a dead process. #4 represents a
dynamically allocated value first within a live process
and then within a dead process.
Figure 4. Investigation Model: Step 1 represents the
process of creating a memory dump. Step 2 represents
the searching process for potential values related to the
target program source code.
4
EXPERIMENTS
Four experiments are designed, each of
which explores the potential evidence that
would prove the actual software usage during
various execution states, see Figure 3. A
variable can be assigned a literal or non-literal
value. Non-literal values are those that
are dynamically calculated or modified and
assigned during program execution. Thus,
experiments 1, 2, and 3 are designed to
investigate variables assigned with string
literals during different execution states.
In contrast, experiment 4 is designed
to investigate variables’ values that are
dynamically allocated and modified. Our
experiments explore three different variable
scopes: global, local auto, and local static.
Experimentation Setup: in all four
experiments, we used a Linux virtual machine
that is created using VirtualBox. The VM runs
openSUSE Linux version 13.1 with 512 MB
of RAM. This VM is hosted on Mac
OS X 10.11.5. See Figure 5.
Figure 5. Experimentation Setup: the VM is created
using Oracle’s VirtualBox.
4.1 Experiment #1
The first experiment is designed to explore the
use of a literal string in a currently active
stack frame and whether it affects the ability
to locate this string in a memory dump that
is created during a live process, see #1 in
Figure 3. It investigates three different variable
scopes: global, local auto, and local static,
each of which is initialized at declaration time
with a string literal. It explores two different
states within an active stack frame of an active
process:
• State 1: The variable is used; reached in
one of the executed statements
• State 2: The variable is not used yet; not
reached in any of the thus far executed
statements
A memory dump is seized in each of these
states for each of the explored variables. Each
of these dumps is searched for the subject
variable and its literal string value. The results
of our findings are presented in Section 5.
4.2 Experiment #2
The second experiment is designed to explore the
use of a literal string in a currently inactive
stack frame and whether it affects the ability
to locate this string in a memory dump, see #2
in Figure 3. Similar to the first experiment,
this one targets live processes with three
different variable scopes: global, local auto,
and local static, each of which is initialized
at declaration time with a string literal. It
explores three different variable states within
an inactive stack frame of an active process:
• State 1: The variable was used; read or
assigned in one of the executed statements
• State 2: The variable was not used; not
read or assigned in any of the thus far
executed statements
• State 3: The stack frame was never active;
the function is never called
A memory dump is seized in each of these
states for each of the three explored variable
scopes. Then, these dumps are searched for
these literal values. Results of our findings are
discussed in Section 5.
4.3 Experiment #3
The third experiment is very similar to the second
experiment, except that it explores the effects of
having an inactive process (just stopped) on
the same three scopes and the same three states
investigated during the second experiment, see
#3 in Figure 3. A memory dump is seized in
each one of these states for each of the three
explored variable scopes. Then, these dumps
are searched for these literal values. The results of
our findings are presented in Section 5.
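A test program for the first three experiments could look like the following sketch. It is an illustration under stated assumptions, not the program used in the paper: the marker strings, the dump_hook type, and every function name are hypothetical, and the strings are declared as character arrays, although the paper does not say whether arrays or pointers to literals were used. Each state would normally be exercised in a separate run, compiled without optimization so that unused variables are not removed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical hook: marks a point where the investigator would seize a dump. */
typedef void (*dump_hook)(const char *state);

int dump_points;                           /* number of dump points reached so far */

/* Example hook: logs the state; a real run would pause here instead. */
void log_dump_point(const char *state)
{
    fprintf(stderr, "dump point: %s\n", state);
    dump_points++;
}

char global_str[] = "GLOBAL-MARKER-0001";  /* global scope, initialized with a literal */

/* Declares one string per local scope and reads all three strings only when
   `use` is nonzero. `inside`, if not NULL, runs while this frame is active
   (experiment 1, State 1 when use != 0 and State 2 when use == 0). */
size_t scoped_strings(int use, dump_hook inside)
{
    char auto_str[] = "AUTO-MARKER-0002";              /* local auto */
    static char static_str[] = "STATIC-MARKER-0003";   /* local static */
    size_t total = 0;

    if (use)
        total = strlen(global_str) + strlen(auto_str) + strlen(static_str);
    if (inside != NULL)
        inside(use ? "active frame, variables used"
                   : "active frame, variables not used yet");
    return total;
}

enum frame_state { USED, NOT_USED, NEVER_CALLED };

/* Experiment 2: reaches a dump point after the frame of scoped_strings() is
   gone, or without ever calling it. For experiment 3 the dump would instead
   be seized after the process has exited. */
void inactive_frame(enum frame_state s, dump_hook after)
{
    assert(after != NULL);
    if (s != NEVER_CALLED)
        (void)scoped_strings(s == USED, NULL);
    after("inactive frame, live process");
}
```

Distinct marker strings per scope make it possible to attribute each hit in the dump to a single variable.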
4.4 Experiment #4
The fourth experiment is designed to investigate
dynamically allocated string variables and
contrast them with variables that are initialized
with literal values in the source code. This
experiment explores the possibility of locating
variables’ values that are allocated in the heap
memory, see #4 in Figure 3.
In this experiment, variables are dynamically
allocated using malloc() and assigned a string
value from another string literal using the
strcpy() function; then some characters are
modified to distinguish the string that resides in
the dynamically allocated heap space from the
original literal string. The experiment explores the potential
of locating these string values in four different
states:
• State 1: malloc(), strcpy(), and one
character is modified
• State 2: malloc(), strcpy(), one character
is modified, then the free() function is
called
• State 3: malloc(), strcpy(), one character
is modified, the free() function is called,
and then the process is terminated
normally
• State 4: malloc(), strcpy(), one character
is modified, and then signal SIGINT
(Control-C) is used to terminate the
program abnormally. No free() is called
explicitly by the user program
States 3 and 4 are designed specifically to
explore the consequences of having a process
that is terminated normally and a process that
is terminated abnormally. A memory dump
is seized in each one of these states. Then,
these memory dumps are searched for these
dynamically allocated and modified string
values. Results of our findings are discussed
in Section 5.
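The allocation sequence behind these states can be sketched as follows. The marker string, the '#' replacement character, the dump_hook type, and the function names are assumptions made for illustration; the paper does not list its test program.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical hook: marks a point where the investigator would seize a dump. */
typedef void (*dump_hook)(const char *state);

int dump_points;                            /* number of dump points reached so far */

/* Example hook: logs the state; a real run would pause here instead. */
void log_dump_point(const char *state)
{
    fprintf(stderr, "dump point: %s\n", state);
    dump_points++;
}

/* malloc() + strcpy() + one modified character, so that the heap copy can be
   told apart from the original literal. Returns NULL if allocation fails. */
char *modified_copy(const char *literal)
{
    assert(literal != NULL && literal[0] != '\0' && literal[0] != '#');
    char *copy = malloc(strlen(literal) + 1);
    if (copy == NULL)
        return NULL;
    strcpy(copy, literal);
    copy[0] = '#';                          /* the single modified character */
    return copy;
}

/* Walks through States 1 and 2; returns 0 on success, -1 if allocation fails. */
int heap_states(dump_hook pause)
{
    assert(pause != NULL);
    char *value = modified_copy("HEAP-MARKER-0004");
    if (value == NULL)
        return -1;
    pause("State 1: allocated, copied, modified");
    free(value);
    pause("State 2: free() called");
    return 0;
}
```

State 3 corresponds to seizing the dump after the program returns normally from main() following heap_states(). State 4 corresponds to interrupting the process with Control-C while it waits at the State 1 dump point, so that free() is never reached.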
5
RESULTS
This section presents the results
from all four experiments.
Results from first experiment: show that
the values of global and local static variables
have two occurrences each, whereas the
value of the local auto variable has only one
occurrence in both states (State 1 and State
2), see Table 1. This means that, in an active
stack frame, whether the referenced variable
is used or not yet used does not affect the number of
occurrences of the searched value that can be
found. It also means that the investigator can
find two occurrences of global and local
static variables initialized with literal strings
and only one occurrence for local
auto variables.
Table 1. Results from the first experiment show that
global and local static variables have two occurrences
whereas the local auto variable has only one occurrence
for its value. States 1 & 2 show that having the
variable used or not-used does not affect the number of
occurrences in all investigated scopes.
Var. Scope    Global    Local (auto)    Local Static
State 1       2         1               2
State 2       2         1               2
Results from second experiment: show that
the values of global and local static variables
are found twice in the RAM dump (two
occurrences). However, the value of the local auto
variable is never found in any of the three
investigated states. This means that having
an inactive stack frame during a live process
reduces our chances of locating the values of
local auto variables to zero, whereas having
an active or inactive stack frame does not
affect the values of global and local static
variables, at least in our investigation setup,
which consists of relatively small programs.
Table 2 presents our findings for each of the
three different variable scopes and each of
the three investigated states.
Table 2. Results from the second experiment show that
global and local static variables have two occurrences
whereas the local auto variable has zero occurrence
for its value. States 1, 2, & 3 show that having the
variable used or not-used does not affect the number
of occurrences in all investigated states, even when the
stack frame is never active.
Var. Scope    Global    Local (auto)    Local Static
State 1       2         0               2
State 2       2         0               2
State 3       2         0               2
Results from third experiment: show that
the value of the local auto variable is never
found in any of the three investigated execution
states (zero occurrences). This is consistent with
the results from the second experiment. However,
the number of occurrences of the values of global
and local static variables decreases from
two to only one for
each variable in each state. This means that, if the
process is inactive (dead), the investigator has
one chance to locate the literal values of global
and local static variables (one occurrence).
Table 3 presents our findings for each of the
three different scopes and each of the three
investigated states.
Table 3. Results from the third experiment show
that global and local static variables have only one
occurrence whereas the local auto variable has zero
occurrences for its value in the RAM dump (when the
process is inactive). This is true in all three states,
which means that having the variable used or not used does
not affect the results, even when the stack frame is
never active, but having an active or inactive
process does affect the results.
Var. Scope    Global    Local (auto)    Local Static
State 1       1         0               1
State 2       1         0               1
State 3       1         0               1
Results from fourth experiment: show that
the investigator has a chance to locate
a dynamically allocated string value that
resides in the heap memory
as long as its process
is live and the value has not been released yet
(the free() function has not been called explicitly in the
program). Otherwise, we have zero chance
of locating any of these string values, at least
in our experimentation setup. Table 4 shows
the number of occurrences for the investigated
string value within four different execution
states.
Table 4. Results from the fourth experiment show that
global, local auto, and local static variables have only
one occurrence in State 1 (where the process is active
and the free() function is not called), whereas all
variable scopes have zero occurrences in all of the
other three states. This means we have a chance to locate
dynamically allocated strings only in State 1.
Var. Scope    Global    Local (auto)    Local Static
State 1       1         1               1
State 2       0         0               0
State 3       0         0               0
State 4       0         0               0
6
RELATED WORK
Many researchers consider RAM a
vital source of information that can be used
in support of legal action against criminals
in digital forensic cases [8, 9, 10, 11, 12, 13].
Shosha et al. developed a prototype
to detect different malicious programs that
are regularly used by criminals. The
proposed approach depends on the deduction
of evidence that is extracted based on traces
related to the suspect program [14]. Chan
et al. introduced ForenScope [15],
a RAM forensic tool that permits users to
investigate a machine using a regular bash shell.
It allows users to disable anti-forensic tools
and search for potential evidence. In order
to keep the RAM contents intact, it is
designed to work in the unused memory space
on the target machine. Petroni et al. introduced
FATKit [16], a digital forensic tool
dedicated to extracting, analyzing, and visualizing
digital forensic data. It utilizes program
source code and its data structures during
the analysis of memory dumps. Arasteh and
Debbabi extract evidence from RAM
based on the logic of the process that is
extracted from its stack memory segment [17].
Olajide et al. use RAM dumps to
extract users’ input information from Windows
applications [18]. Shashidhar and Novak
targeted the prefetch folder and its potential
value to the investigator. This prefetch folder is
used to speed up the startup time of a program
on a Windows machine [19].
7
FUTURE WORK
For future work, we are planning to investigate
other environments: Windows, Mac, and
small devices such as phones and tablets.
Some languages have their own memory
management system and their own virtual
machine, while others, just like C, depend
directly on the operating system for their
memory management. We plan to investigate
the differences in the behavior of various
programming languages such as C++, Java, C#,
and Python. Furthermore, we are looking
forward to investigating similar scenarios for
other data types and data structures. Finally, it
would be important to investigate various types
and their impacts on long-running programs
such as servers.
8
CONCLUSION
This paper utilizes information from the
source code of a program and employs the
program’s execution data during various
execution states to help the investigator establish
evidence against a perpetrator. This
can help law enforcement take legal
action against criminals in a court of
law. Our experimentation is based on the
C programming language. Based on these
experiments, we found that utilizing source
code information can be valuable to the
investigator. It helps establish evidence that
the perpetrator actually used the software to
perform the crime or to cover up the wrongdoing.
Various string literals and non-literals related
to the program execution are successfully
located during various scenarios and execution
states.
REFERENCES
[1] M. H. Ligh, A. Case, J. Levy, and A. Walters, The
art of memory forensics: detecting malware and
threats in windows, linux, and mac memory. John
Wiley & Sons, 2014.
[2] M. Banahan, D. Brady, and M. Doran, The C book.
No. ANSI-X-3-J-11-DRAFT, Addison-Wesley
New York, 1988.
[3] D. P. Bovet and M. Cesati, Understanding the
Linux kernel. O’Reilly Media, Inc., 2005.
[4] A. Josey, D. Cragun, N. Stoughton, M. Brown,
C. Hughes, et al., “The open group base
specifications issue 6 ieee std 1003.1,” The IEEE
and The Open Group, vol. 20, no. 6, 2004.
[5] E. Youngdale, “Kernel korner: The elf object file
format by dissection,” Linux Journal, vol. 1995,
no. 13es, p. 15, 1995.
[6] H. Lu, “Elf: From the programmer’s perspective,”
in NYNEX Science & Technology Inc, Citeseer,
1995.
[7] W. R. Stevens and S. A. Rago, Advanced
programming in the UNIX environment.
Addison-Wesley, 2013.
[8] M. I. Al-Saleh and Z. A. Al-Sharif, “Utilizing
data lifetime of tcp buffers in digital forensics:
Empirical study,” Digital Investigation, vol. 9,
no. 2, pp. 119–124, 2012.
[9] Z. A. Al-Sharif, D. N. Odeh, and M. I.
Al-Saleh, “Towards carving pdf files in the
main memory,” in The International Technology
Management Conference (ITMC2015), pp. 24–31,
The Society of Digital Information and Wireless
Communication, 2015.
[10] V. S. Harichandran, D. Walnycky, I. Baggili, and
F. Breitinger, “Cufa: A more formal definition
for digital forensic artifacts,” Digital Investigation,
vol. 18, pp. S125–S137, 2016.
[11] M. Rafique and M. Khan, “Exploring static
and live digital forensics: Methods, practices
and tools,” International Journal of Scientific
& Engineering Research, vol. 4, no. 10,
pp. 1048–1056, 2013.
[12] F. N. Dezfoli, A. Dehghantanha, R. Mahmoud,
N. F. B. M. Sani, and F. Daryabar, “Digital
forensic trends and future,” International Journal
of Cyber-Security and Digital Forensics (IJCSDF),
vol. 2, no. 2, pp. 48–76, 2013.
[13] L. Cai, J. Sha, and W. Qian, “Study on
forensic analysis of physical memory,” in
Proc. 2nd International Symposium on Computer,
Communication, Control and Automation (3CA
2013), 2013.
[14] A. F. Shosha, L. Tobin, and P. Gladyshev, “Digital
forensic reconstruction of a program action,” in
Security and Privacy Workshops (SPW), 2013
IEEE, pp. 119–122, IEEE, 2013.
[15] E. Chan, W. Wan, A. Chaugule, and R. Campbell,
“A framework for volatile memory forensics,”
in Proceedings of the 16th ACM conference on
computer and communications security, 2009.
[16] N. L. Petroni, A. Walters, T. Fraser, and W. A.
Arbaugh, “Fatkit: A framework for the extraction
and analysis of digital forensic data from volatile
system memory,” Digital Investigation, vol. 3,
no. 4, pp. 197–210, 2006.
[17] A. R. Arasteh and M. Debbabi, “Forensic memory
analysis: From stack and code to execution
history,” Digital Investigation, vol. 4, pp. 114–125,
2007.
[18] F. Olajide, N. Savage, G. Akmayeva, and
C. Shoniregun, “Identifying and finding forensic
evidence on windows application,” Journal of
Internet Technology and Secured Transactions,
ISSN, pp. 2046–3723, 2012.
[19] N. K. Shashidhar and D. Novak, “Digital forensic
analysis on prefetch files,” International Journal
of Information Security Science, vol. 4, no. 2,
pp. 39–49, 2015.