A Crash Course On x86 Disassembly
Chapter 4
Levels of Abstraction
Computer systems: several levels of abstraction that hide implementation details.
Six levels of abstraction:
Hardware: digital logic gates (AND, OR, XOR, NOT)
Microcode: firmware that interfaces with the hardware
Machine code: opcodes, written as hex digits (produced by compiling)
Low-level languages: human-readable version of the instruction set (assembly)
High-level languages: C/C++ -> compiled into machine code
Interpreted languages: C#, Java -> translated into bytecode (which is then translated into machine code)
PC Architectures
Focus: x86 32-bit, Intel IA-32
Other architectures: x64, MIPS, ARM
Incorrect specification will lead to errors, and the program is most likely
to crash.
AL: lower 8 bits
AH:
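The register naming can be illustrated with a small Python sketch. It assumes the standard IA-32 layout, in which AX is the low 16 bits of EAX, AL is the low byte of AX, and AH is the high byte of AX. The function name `sub_registers` and the sample value are made up for illustration.

```python
def sub_registers(eax):
    """Split a 32-bit EAX value into its overlapping sub-registers."""
    eax &= 0xFFFFFFFF            # keep only 32 bits
    ax = eax & 0xFFFF            # AX: low 16 bits of EAX
    return {
        "EAX": eax,
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,  # AH: high byte of AX (bits 8-15 of EAX)
        "AL": ax & 0xFF,         # AL: low byte of AX (bits 0-7 of EAX)
    }

regs = sub_registers(0xA9DC81F5)  # arbitrary sample value
print({name: hex(value) for name, value in regs.items()})
```

Writing to AL or AH changes only that byte of EAX, which is why disassembly often shows byte-sized operations on these names.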
Registers
• EAX, EDX for multiplication and division
Multiplication: multiplies the unsigned operand by EAX and stores the 64-bit result in EDX:EAX. EDX:EAX means that the low (least significant) 32 bits are stored in EAX and the high (most significant) 32 bits are stored in EDX.
Division: divides the 64-bit value in EDX:EAX by the operand. The quotient is stored in EAX, the remainder in EDX.
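A minimal sketch of the EDX:EAX split for unsigned multiplication, modelled in Python. The helper name `mul32` is hypothetical. It only mimics how the 64-bit product is divided between the two registers.

```python
MASK32 = 0xFFFFFFFF

def mul32(eax, operand):
    """Model unsigned multiplication by EAX: return (EDX, EAX) holding the 64-bit product."""
    product = (eax & MASK32) * (operand & MASK32)
    edx = (product >> 32) & MASK32   # high (most significant) 32 bits
    new_eax = product & MASK32       # low (least significant) 32 bits
    return edx, new_eax

edx, eax = mul32(0xFFFFFFFF, 2)      # product is 0x1FFFFFFFE
print(hex(edx), hex(eax))
```

When the product fits in 32 bits, EDX ends up zero. Code that reads EDX after a multiplication is usually handling the overflow case.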
Use of registers follows certain conventions
E.g. EAX generally contains the return value of a function call.
Malware analysts need to know these conventions to examine code quickly
Flags
EFLAGS register: status register (32 bits, each bit is a flag). Some important flags:
ZF: zero flag, set if the result of an operation is zero
CF: carry flag, set if the result of an operation is too large or too small for the destination operand
SF: sign flag, set if the result of an operation is negative (most significant bit set)
TF: trap flag, used for debugging (the processor executes one instruction at a time if set)
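The flag rules above can be sketched in Python for 32-bit add and sub. The helper names `add32`, `sub32` and `flags_for` are made up for this illustration. Only ZF, CF and SF are modelled, and the other flags are ignored.

```python
MASK32 = 0xFFFFFFFF

def flags_for(result, carry):
    """Derive ZF, CF and SF from a result and a carry/borrow indicator."""
    result &= MASK32
    return {"ZF": int(result == 0), "CF": int(carry), "SF": result >> 31}

def add32(dest, value):
    """Model `add dest, value`: CF is set when the sum does not fit in 32 bits."""
    dest &= MASK32
    value &= MASK32
    total = dest + value
    return total & MASK32, flags_for(total, total > MASK32)

def sub32(dest, value):
    """Model `sub dest, value`: CF is set when dest < value (unsigned borrow)."""
    dest &= MASK32
    value &= MASK32
    return (dest - value) & MASK32, flags_for(dest - value, dest < value)

print(add32(0xFFFFFFFF, 1))  # wraps to 0: ZF and CF set
print(sub32(5, 5))           # result 0: ZF set
print(sub32(3, 5))           # borrow: CF and SF set
```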
Extended Instruction Pointer (EIP)
EIP: a register that contains the memory address of the next instruction to be executed
Tells the processor what to do next
If EIP is corrupted and points to a memory address that does not hold legitimate code, the program will crash
Attackers gain control of EIP through exploitation: they place attack code in memory, then change EIP to point to that code to exploit the system
Example: buffer overflow attacks
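To illustrate why a buffer overflow can give control of EIP, here is a toy Python model of a stack frame. The layout (an 8-byte buffer, then the saved EBP, then the return address) and all addresses are assumptions chosen for the example, not taken from a real program. The function name `simulate_frame` is hypothetical.

```python
import struct

def simulate_frame(user_input):
    """Copy input into an 8-byte buffer without a bounds check; return the stored return address."""
    # Assumed layout, low to high address: buffer[8] | saved EBP[4] | return address[4]
    frame = bytearray(16)
    struct.pack_into("<I", frame, 8, 0x0012FF80)    # saved EBP (made-up value)
    struct.pack_into("<I", frame, 12, 0x00401050)   # legitimate return address (made-up value)
    n = min(len(user_input), len(frame))            # the model stops at the end of the frame,
    frame[0:n] = user_input[:n]                     # but not at the end of the 8-byte buffer
    return struct.unpack_from("<I", frame, 12)[0]   # value that the return would load into EIP

print(hex(simulate_frame(b"A" * 8)))                                   # fits in the buffer
print(hex(simulate_frame(b"A" * 12 + struct.pack("<I", 0xDEADBEEF))))  # overwrites the return address
```

In the second call the input runs past the buffer, so the value later popped into EIP is whatever the input supplied.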
Instructions
mov: mov destination, source (moves data into registers or RAM)
• lea: load effective address (puts a memory address into the destination), e.g. lea eax, [ebx+8] -> puts EBX+8 into EAX
• mov eax, [ebx+8] -> loads the data at the memory address specified by EBX+8
• lea eax, [ebx+8] has the effect of "mov eax, ebx+8" (which is not itself a valid instruction)
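The difference between lea and a mov with a memory operand can be sketched with a dictionary standing in for RAM. The names `memory`, `lea` and `mov_load`, and the addresses, are hypothetical.

```python
memory = {0x1008: 0xCAFEBABE}  # toy RAM: address -> 32-bit value (made-up contents)

def lea(base, disp):
    """lea reg, [base+disp]: compute the address only; memory is never read."""
    return (base + disp) & 0xFFFFFFFF

def mov_load(base, disp):
    """mov reg, [base+disp]: compute the address, then load the value stored there."""
    return memory[(base + disp) & 0xFFFFFFFF]

ebx = 0x1000
print(hex(lea(ebx, 8)))       # the address EBX+8
print(hex(mov_load(ebx, 8)))  # the data stored at EBX+8
```

Because lea never touches memory, compilers also use it for plain arithmetic, so seeing lea does not always mean a pointer is involved.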
Arithmetic
Addition: add destination, value
Subtraction: sub destination, value (ZF set if the result is zero; CF set if destination < value)
inc/dec: increment or decrement a register by one
Multiplication and division: act on predefined registers
mul value: multiplies EAX by value. The result is stored as a 64-bit value across EDX and EAX: EDX holds the most significant 32 bits, EAX the least significant 32 bits.
div value: divides the 64 bits across EDX and EAX by value. The quotient is stored in EAX, the remainder in EDX.
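A short sketch of unsigned div in Python, assuming the dividend is the 64-bit value EDX:EAX. The helper name `div32` is made up. The two error cases model the divide error the processor raises, using Python exceptions as a stand-in.

```python
MASK32 = 0xFFFFFFFF

def div32(edx, eax, value):
    """Model unsigned `div value`: return (quotient for EAX, remainder for EDX)."""
    value &= MASK32
    if value == 0:
        raise ZeroDivisionError("division by zero: the CPU raises a divide error")
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)  # 64-bit value EDX:EAX
    quotient, remainder = divmod(dividend, value)
    if quotient > MASK32:
        raise OverflowError("quotient does not fit in EAX: the CPU raises a divide error")
    return quotient, remainder

print(div32(0, 17, 5))       # 17 / 5
print(div32(1, 0, 2))        # 0x100000000 / 2
```

This is why compiled code typically clears EDX (for example with xor edx, edx) before an unsigned div: stale high bits would change the dividend.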
Logical Operations
or, and, xor: xor eax, eax -> sets EAX to zero (optimization for clearing a register)
33 C0 xor eax, eax; B8 00 00 00 00 mov eax, 0 -> 2 bytes vs. 5 bytes
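The size argument can be checked by counting the encoded bytes. The sketch below compares the encoding of xor eax, eax (33 C0) with that of mov eax, 0 (B8 followed by a 32-bit immediate of zero), and confirms that XOR-ing a value with itself always gives zero. The variable names are hypothetical.

```python
xor_eax_eax = bytes.fromhex("33 C0")           # xor eax, eax
mov_eax_0 = bytes.fromhex("B8 00 00 00 00")    # mov eax, 0 (opcode + 32-bit immediate)

print(len(xor_eax_eax), "bytes vs.", len(mov_eax_0), "bytes")

# Any value XOR-ed with itself is zero, so the old contents of EAX do not matter.
for eax in (0, 1, 0x1234, 0xFFFFFFFF):
    assert eax ^ eax == 0
```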
shr/shl: shift a register right/left: shr destination, count
Bits shifted beyond the boundary are first shifted into CF.
ror/rol: rotate; bits do not fall off but wrap around to the other side
Shifting: an optimization of multiplication -> each shift left multiplies by two; n bits -> ?
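Shifts and rotates on a 32-bit register can be modelled as follows. The helper names are made up, and the shift helpers assume a count between 1 and 31. The sketch also shows that a left shift by n multiplies by 2**n, as long as no set bits are shifted out.

```python
MASK32 = 0xFFFFFFFF

def shl(value, count):
    """shl: return (result, CF); CF receives the last bit shifted out on the left."""
    value &= MASK32
    cf = (value >> (32 - count)) & 1
    return (value << count) & MASK32, cf

def shr(value, count):
    """shr: return (result, CF); CF receives the last bit shifted out on the right."""
    value &= MASK32
    cf = (value >> (count - 1)) & 1
    return value >> count, cf

def rol(value, count):
    """rol: bits leaving on the left re-enter on the right."""
    value &= MASK32
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32

def ror(value, count):
    """ror: bits leaving on the right re-enter on the left."""
    value &= MASK32
    count %= 32
    return ((value >> count) | (value << (32 - count))) & MASK32

print(shl(3, 4))                 # 3 * 2**4, nothing lost
print(shl(0x80000001, 1))        # top bit falls into CF
print(hex(rol(0x80000001, 1)))   # top bit wraps around to bit 0
```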
Stack
Last In First Out (LIFO)
What is stored on the stack?
Data for functions, local variables, flow control
ESP and EBP registers
ESP -> stack pointer (memory address of the top of the stack)
EBP -> base pointer (stays consistent within a given function -> used for keeping track of local variables/parameters)
Short-term storage
The stack “grows” from high addresses to low addresses
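A minimal model of push and pop shows the LIFO order and the direction of growth. The class name `Stack` and the starting ESP value are assumptions for illustration. Each slot is 4 bytes, matching 32-bit pushes.

```python
class Stack:
    """Toy x86 stack: push decrements ESP by 4, pop increments it by 4."""

    def __init__(self, top=0x0012FFC0):   # made-up initial ESP
        self.esp = top
        self.memory = {}                  # address -> 32-bit value

    def push(self, value):
        self.esp -= 4                     # stack grows toward lower addresses
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.memory[self.esp]
        self.esp += 4                     # stack shrinks toward higher addresses
        return value

stack = Stack()
stack.push(0x11111111)
stack.push(0x22222222)
esp_after_pushes = stack.esp              # 8 bytes below the starting address
first_popped = stack.pop()                # last value pushed comes off first
second_popped = stack.pop()
print(hex(esp_after_pushes), hex(first_popped), hex(second_popped))
```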
Function Calls
Prologue: prepares the stack and registers
Epilogue: restores the stack and registers
1. Arguments are pushed on the stack
2. Function is called using call memory_location: the current instruction address (contents of the EIP register) is pushed onto the stack (for the return); EIP is set to memory_location (the start of the function)
3. Space is allocated for local variables and EBP is pushed onto the stack
4. Finish: ESP is adjusted to free the local variables; EBP is restored; the return address is popped off the stack into EIP (for the next instructions)
pusha: pushes the 16-bit registers in order: AX, CX, DX, BX, SP, BP, SI, DI (compilers rarely use it; shellcode uses it to store intermediate state)
pushad: pushes the 32-bit registers in order: EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI
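The call sequence above can be traced with a small Python model. It assumes the common prologue push ebp / mov ebp, esp / sub esp, N, the matching epilogue, and a caller that cleans up its own arguments (cdecl style). The class name `CPU` and all addresses are made up.

```python
class CPU:
    """Toy model of ESP, EBP and EIP during a function call."""

    def __init__(self):
        self.esp, self.ebp, self.eip = 0x0012FFC0, 0x0012FFF0, 0x00401000  # made-up values
        self.mem = {}

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def call(self, target, return_address):
        self.push(return_address)    # address of the instruction after the call
        self.eip = target            # jump to the start of the function

    def prologue(self, local_bytes):
        self.push(self.ebp)          # push ebp
        self.ebp = self.esp          # mov ebp, esp
        self.esp -= local_bytes      # sub esp, N (space for local variables)

    def epilogue(self):
        self.esp = self.ebp          # mov esp, ebp (frees the local variables)
        self.ebp = self.pop()        # pop ebp (restores the caller's EBP)
        self.eip = self.pop()        # ret (pops the return address into EIP)

cpu = CPU()
start_esp, start_ebp = cpu.esp, cpu.ebp
cpu.push(2)                                      # step 1: second argument
cpu.push(1)                                      # step 1: first argument
cpu.call(0x00401050, return_address=0x00401010)  # step 2
cpu.prologue(local_bytes=8)                      # step 3
first_arg = cpu.mem[cpu.ebp + 8]                 # arguments sit above the saved EBP and return address
cpu.epilogue()                                   # step 4
cpu.esp += 8                                     # caller removes its two arguments (assumed cdecl)
print(hex(cpu.eip), hex(cpu.esp), hex(cpu.ebp), first_arg)
```

Inside the function, parameters appear at positive offsets from EBP and local variables at negative offsets, a pattern that is easy to spot in disassembly.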
Example (function call)