Introduction To x64 Assembly

Abstract

The name x64 refers to a 64-bit instruction set for Intel and AMD processors, which are commonly found in current-generation laptop and desktop computers. This document introduces a subset of x64, including the features needed for a Compilers course at USI. We write assembler files using "Intel syntax", and we adopt the C calling conventions of Mac OS X.

1 Example: Hello, World!

Figure 1 shows an example program in x64 assembler that prints a greeting to standard output. Before we look at what this does, let's try running it. The steps are as follows:

• Put the code in a file called main.s. The file extension .s indicates an assembler file.

• Run the assembler and linker. We use gcc for this. When the input file to gcc is an assembler file, gcc skips its C front-end, and just assembles and links the file. We need to provide the option -m64 to choose x64 assembler. Putting it all together, we get the following command-line:
    gcc -m64 main.s

• The previous step produced an executable file a.out with the machine code. Run it:
    ./a.out

While we are on the topic of tooling, you can also use gcc to compile a C program to assembly by using the gcc command-line option -S. This option instructs gcc to stop after generating assembly code, without assembling or linking. That is a useful approach for getting concrete examples of various code sequences. If you specify -Wall to enable warnings and -ansi to select the C dialect, the following command-line compiles a source program main.c to a target program main.s:
    gcc -Wall -ansi -m64 -masm=intel -S main.c

    .intel_syntax
    .text
    .globl main
    main:
        /* function prologue */
        push rbp
        mov rbp, rsp
        /* call puts("Hello, World!") */
        lea rdi, [rip + main.S_0]
        call puts
        /* return zero */
        mov rax, 0
        mov rsp, rbp
        pop rbp
        ret
    main.end:
    main.S_0:
        .string "Hello, World!"

Figure 1: A simple program in x64 assembler.
See also: http://www.inf.usi.ch/faculty/soule/teaching/2015-fall/cc/x64-intro/hello_world.txt

2 x64 Syntax

Rather than give a formal grammar for x64, this section describes it using the example in Figure 1. There is one statement per line, which means that changing newlines would change the meaning of the program. Other than that, the syntax is insensitive to whitespace, meaning that additional spaces, tabs, or comments do not affect program behavior. Comments start with /* and end with */.

The example program contains two kinds of statements: directives and instructions. Directives start with a period, such as .intel_syntax, whereas instructions consist of an operator and a list of operands, such as mov rbp, rsp. In addition, a statement can start with labels, which are symbols followed by a colon, such as main.end:.

The directive .intel_syntax at the start of the file selects Intel syntax. Without that directive, the default is .att_syntax. One major difference between the two options is the order of operands: Intel syntax shows the destination operand first, whereas AT&T syntax shows the destination operand last. While the

Dragon book does not show any x64 instructions, it does adopt the destination-first convention for assembler code, so using Intel syntax is less confusing.

The directive .text tells the assembler to put the following statements in the text section, which contains executable code. The directive .string "Hello, World!" copies the characters in the string to the binary file, and ends it with a 0-byte. The string constant may contain escapes, such as \n for newline.

The directive .globl main makes the label main visible to the linker so it can be called from outside.

To summarize, we start each file with .intel_syntax, followed by a .text directive declaring the text section, which contains the code for the functions and read-only data, such as strings. We won't need to declare any mutable data in the assembly file, as we will use dynamically allocated memory to represent Tack values. Each function should have a .globl directive matching the function's start label for it to be called from outside (for our purposes it suffices to make only main global, as our simple compiler only supports single-file programs). Please note that the C compiler adds an underscore prefix to function names, so all C library function names are prefixed, and the C run-time system expects your main to have this prefix as well. To learn more about as (the GNU assembler), see the user manual:
http://sourceware.org/binutils/docs-2.21/as

3 Addresses

As mentioned before, an instruction consists of zero or more labels, an operator, and zero or more operands. We refer to operands as "addresses", even when they are non-pointer values. We use the following kinds of addresses:

• Registers (r). There are sixteen 64-bit general-purpose registers: rax to rdx, rsp, rbp, rsi, rdi, and r8 to r15. However, some of these registers play a special role, for example, rsp and rbp typically hold the stack pointer and base pointer, as their names imply.

• Immediate operands (i). These are either integer constants or labels.

  Integer constants are written as either the digit 0, or a digit from 1-9 followed by zero or more digits from 0-9.

  Label operands come in two forms. On the one hand, control transfer instructions, such as jmp or call, use code labels, which are simply the symbol name, such as main. On the other hand, to get the address of data stored in the text section we can use the lea (Load Effective Address) instruction, where the label serves as an offset relative to the current execution address (stored in the rip register), like lea rdi, [rip + main.S_0] in Figure 1.

• Memory operands (m). These addresses add a constant to a register, and then dereference the resulting pointer. For example, qword ptr [rax+8] takes the value of register rax, adds 8, interprets the result as a pointer, and dereferences it. The offset can also be negative, such as qword ptr [rsp-16], or it can be omitted when zero, such as qword ptr [rbp].

Not every instruction accepts every kind of address. The abbreviations r, i, and m serve to indicate which addressing modes are supported. To learn more about addresses, see Volume 1 of the "Intel 64 and IA-32 Architectures Software Developer's Manual":
http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html/

4 Instructions

The following reference lists instructions in roughly alphabetical order. When an instruction has multiple addressing modes, the alternatives are separated by a vertical bar |. As a general rule of thumb, most instructions support only one memory operand (m), not two. Typically, the first operand is a destination operand, in other words, many instructions store their result in the first operand.

add → add r, i | add r, r | add r, m | add m, i | add m, r
Compute the sum of the two operands, and store the result in the first operand.

call → call label
Subtract 8 from rsp. Store the return address into [rsp]. Jump to the label.

cmp → cmp r, i | cmp r, r | cmp r, m | cmp m, i | cmp m, r
Compare the two operands. Encode the result in status flags in an internal register, which can then be

used for the various conditional jump instructions: je, jg, jge, jl, jle, and jne.

idiv → idiv r | idiv m
Treat rdx:rax as a single, signed 128-bit integer value. Divide this value by the operand. Store the quotient, rounded toward zero, in rax, and the remainder in rdx. A common idiom to prepare rdx for this instruction is to first do mov rdx, rax; sar rdx, 63, which fills rdx entirely with the appropriate sign bit.

imul → imul r, r | imul r, m
Compute the product of the two signed integer operands, and store the result in the first operand.

jmp → jmp label
Jump unconditionally to label.

je → je label
Jump to label if the first operand of the preceding cmp instruction was equal to the second operand.

jg → jg label
Jump to label if the first operand of the preceding cmp instruction was > the second operand.

jge → jge label
Jump to label if the first operand of the preceding cmp instruction was ≥ the second operand.

jl → jl label
Jump to label if the first operand of the preceding cmp instruction was < the second operand.

jle → jle label
Jump to label if the first operand of the preceding cmp instruction was ≤ the second operand.

jne → jne label
Jump to label if the first operand of the preceding cmp instruction was ≠ the second operand.

mov → mov r, i | mov r, r | mov r, m | mov m, i | mov m, r
Copy the value of the second operand to the first operand.

neg → neg r | neg m
Replace the operand with its two's complement negation, i.e., signed integer minus.

pop → pop r | pop m
Copy the value from [rsp] to the operand, then add 8 to rsp.

push → push i | push r | push m
Subtract 8 from rsp, then copy the operand value to [rsp].

ret → ret
Retrieve the return address from [rsp]. Add 8 to rsp. Jump to the return address.

shl → shl r, i | shl m, i
Perform a left-shift on the first operand, with the amount given by the second operand. A left-shift fills in with zero bits.

sar → sar r, i | sar m, i
Perform an arithmetic right-shift on the first operand, with the amount given by the second operand. An arithmetic right-shift preserves the sign, by filling in with the left-most (sign) bit.

shr → shr r, i | shr m, i
Perform a logical right-shift on the first operand, with the amount given by the second operand. A logical right-shift ignores the sign, by filling in with zero bits.

sub → sub r, i | sub r, r | sub r, m | sub m, i | sub m, r
Subtract the second operand from the first operand, and store the result in the first operand.

lea → lea r, m
Load the address specified by the memory operand and store it into the register.

The x64 instruction set has many more instructions than shown here. Furthermore, most of the instructions support more addressing modes than listed. The reference here should suffice for our compiler construction project, but if you want to learn more, see Volume 2 of the "Intel 64 and IA-32 Architectures Software Developer's Manual":
http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html/

5 Calling Conventions

We adopt the same calling conventions as for the C programming language, because that enables us to call external functions defined in the runtime, such as a print function. In the assembly code of the caller, the calling sequence is the same, irrespective of whether the callee is written in assembly or compiled from C. In fact, Figure 1 shows a call from assembly to the C function puts. Conversely, C code can call assembly functions. In fact, main gets invoked from "outside".
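One way to get a concrete calling sequence to study is to compile a small C file with the -S command line from Section 1 and read the generated assembly. The sketch below is only an illustration: the names sum8 and call_sum8 are hypothetical, and the exact assembly produced depends on the compiler version and flags. With eight integer arguments, the call site should show both ways a caller can hand values to a callee.

```c
/* Hypothetical callee taking eight 8-byte integer arguments. */
long sum8(long a0, long a1, long a2, long a3,
          long a4, long a5, long a6, long a7)
{
    return a0 + a1 + a2 + a3 + a4 + a5 + a6 + a7;
}

/* Hypothetical caller: its generated assembly contains the calling
   sequence (argument setup followed by a call instruction). */
long call_sum8(void)
{
    return sum8(1, 2, 3, 4, 5, 6, 7, 8);
}
```

Saving this as, say, main.c and running gcc -Wall -ansi -m64 -masm=intel -S main.c (without optimization flags, so the call is not folded away) yields a main.s whose call_sum8 body can be compared against the conventions described in this section.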

The stack grows "down", which means that new slots are added at the bottom, and older slots reside at higher addresses. This is reflected in some of the instructions we saw in the previous section (push, pop, call, and ret).

(a) After pushing arguments, before making call

    Address                     Contents
    rbp − 8                     caller local
    ...                         variables and
    rbp − frame size            temporaries
    rsp + 8 * (n−7)             arg[n−1]
    ...                         ...
    rsp + 8                     arg[7]
    rsp                         arg[6]

(b) After making call, after function prologue

    Address                     Contents
    rbp + 8 * (n−5)             arg[n−1]
    ...                         ...
    rbp + 24                    arg[7]
    rbp + 16                    arg[6]
    rbp + 8                     return address
    rbp                         caller rbp
    rbp − 8                     callee local
    ...                         variables and
    rsp = rbp − frame size      temporaries

Figure 2: Stack layout for C calling conventions on x64/Mac OS X.

Figure 2 shows the stack layout, both before and after a function call. If the function has more than 6 arguments, then arguments 0 ... 5 get passed in registers rdi, rsi, rdx, rcx, r8, and r9, and arguments 6 ... n − 1 get passed on the stack. If the function has at most 6 arguments, all arguments get passed in registers. Just before the caller executes the call instruction, the stack layout is as shown in Figure 2(a), with register rsp (the stack pointer) pointing to the lowest argument on the stack.

The call instruction pushes the return address on the stack. Then, the callee is responsible for pushing the old value of rbp (the base pointer) on the stack, and setting the new value of rbp to point to the location of the old value. After that, the callee can use further stack space for its own local variables and temporaries. In that case, rbp remains unchanged, pointing to the base of the stack frame, whereas rsp points to the end of the stack frame, as shown in Figure 2(b). The following pseudo-code shows the callee's prologue sequence:

    push rbp
    mov rbp, rsp
    sub rsp, /*frame size*/

Please note that the Mac OS X dynamic linker requires the stack pointer (rsp register) to be 16-byte aligned. To achieve this, you have to make sure that the allocated frame size is always a multiple of 16. Also, you have to take care of this alignment while allocating space for arguments passed via the stack.

A non-void function returns its result through register rax. Other than that, the function epilogue resets rsp back to the start of the frame, pops the old value of rbp, and uses the ret instruction to pop the return address and jump back to the caller. The following pseudo-code shows the callee's return sequence:

    mov rax, /*return value*/
    mov rsp, rbp
    pop rbp
    ret

One issue we have not yet discussed is caller-save registers and callee-save registers. As the names imply, the caller must save values of caller-save registers before it makes the call, as they may be lost when the callee overwrites them. In other words, caller-save registers "belong to" the callee. On the other hand, the callee must save values of callee-save registers in the prologue sequence and restore them in the epilogue sequence, as the caller may expect that their value after the return is the same as before the call. In other words, callee-save registers "belong to" the caller.

In the C calling conventions for x64/Mac OS X, registers rbp, rbx, and r12 through r15 belong to the caller (are callee-save registers), and all remaining registers belong to the callee (are caller-save registers). However, it is often not necessary to save and restore registers, since they may not hold live values. For example, consider the caller-save register rdx. If the caller does not keep a value in rdx across a call, it does not need to save and restore rdx.

Calling conventions are often part of the so-called ABI, which stands for "application binary interface". The calling conventions described here are only a subset of the C ABI for x64/Mac OS X: we only discuss values of size 8 bytes that can be stored in a single register or stack slot, and we only discuss general-purpose registers, no floating point or vector registers. If you want to learn more about the full-fledged ABI, you can use the following document:
http://x86-64.org/documentation/abi.pdf
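The frame-layout arithmetic in this section can be sketched as a few C helpers, for instance inside a compiler's code generator. This is a minimal sketch under the assumptions stated in the text (8-byte slots, six register arguments, 16-byte alignment); the function names align_frame_size, stack_arg_count, and arg_offset_from_rbp are hypothetical and not part of any course material.

```c
#include <stdint.h>

/* Round a frame size in bytes up to the next multiple of 16, so that
   "sub rsp, frame_size" in the prologue keeps rsp 16-byte aligned. */
uint64_t align_frame_size(uint64_t bytes)
{
    return (bytes + 15u) & ~(uint64_t)15u;
}

/* Number of arguments passed on the stack for a call with n arguments:
   arguments 0..5 travel in registers, the rest use stack slots. */
uint64_t stack_arg_count(uint64_t n)
{
    return n > 6 ? n - 6 : 0;
}

/* Offset from rbp, inside the callee, of stack argument i (i >= 6),
   following Figure 2(b): arg[6] sits at rbp + 16, arg[7] at rbp + 24. */
int64_t arg_offset_from_rbp(uint64_t i)
{
    return 16 + 8 * (int64_t)(i - 6);
}
```

For example, a function with three 8-byte locals needs 24 bytes, which align_frame_size rounds up to 32, and the last of n stack-passed arguments lands at rbp + 8 * (n − 5), matching the figure.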
