Chapter 2.12: Compilation, Assembling, Linking and Program Execution
Compilation Process in C
• Compilation process: gcc hello.c -o hello
– Constructing an executable image for an application
– Multiple stages
– Command:
gcc <options> <source_file.c>
• Compiler Tool
– gcc (GNU Compiler Collection)
• man gcc (on a Linux machine)
4 Stages of Compilation Process
Preprocessing
gcc -E hello.c -o hello.i
hello.c → hello.i
Compilation (after preprocessing)
gcc -S hello.i -o hello.s
4 Stages of Compilation Process
1. Preprocessing (lines that begin with #)
– Expands header files (#include …)
– Substitutes macros (#define …); inline functions are handled later, by the compiler
2. Compilation
– Generates assembly language, a .s file
– Verifies function usage against prototypes
– Header files supply the prototype declarations
3. Assembling
– Generates a relocatable object file (contains machine instructions), a .o file
– nm app.o
0000000000000000 T main
                 U puts
– The nm or objdump tool is used to view object files
4 Stages of Compilation Process (contd.)
4. Linking
– Generates the executable file (the nm tool can also be used to view the executable)
– Binds appropriate libraries
• Static Linking
• Dynamic Linking (default)
• man gcc
Preprocessing
• Things with #
– #include <stdio.h>
– #define REAL float
– Others
• Processes the C source files before handing them to the compiler.
– Hence "pre"-process
– gcc -E
– cpp
File Inclusion
• Recall: #include <filename>
– #include <foo.h>
• Searches the system directories
– #include "foo.h"
• Searches the current directory first, then the system directories
– Use gcc -I/usr/include to specify where to search for header files
• gcc -I/usr/include sum_full.c -o sum
Macros
• Defined with #define and replaced during preprocessing
– Every occurrence of REAL is replaced with float before compilation.
About printf in C
• printf("format string", vars);
• Format string?
– "This year is %d\n"
– "Your score is %d\n"
• Conversion by %
– %d : int
– %f : float, double
– %c : char
– %s : char *, string
– %e : float, double in scientific form
Tools and Steps for Program Execution
[Figure: build tool flow. User-created files: C/C++ source and header files, assembly source files, Makefile, linker script file. Tools: compiler, assembler, archive utility, linker and locator. Intermediate files: object files, library files. Outputs: shared object file, linkable image file, executable image file, link map file.]
Code Can be in Assembly Language
• Assembly language is either written by a programmer or produced as the output of a compiler.
High-Level Program, Assembly Code and Binary
Hands-On, sum x86_64
https://passlab.github.io/ITSC3181/exercises/sum/sum_full.c
https://passlab.github.io/ITSC3181/exercises/sum/sum_full_x86.s
• A function in assembly
– .globl: a global symbol
– .type
– .cfi_startproc
– .cfi_endproc
– ret: return
• for loop
– check i<N; if true, continue, else goto end
– loop body
– i++
– end
Sum, RISC-V and MIPS
[Figure: side-by-side assembly listings, RISC-V version and MIPS version]
• The versions differ mainly in their instructions
• for loop
– check i<N; if true, continue, else goto end
– loop body
– i++
– end
Sum, x86_64
• Number of instructions per loop
iteration
– Count it
$$
\text{CPU Time} = \frac{\#\,\text{Instructions}}{\text{Program}} \times \frac{\#\,\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}
$$
When to Use Assembly Language
• Advantages: speed, size, and predictability
– No compiler as middleman
– Fits mission-critical and embedded domains, e.g. space shuttle or car control
• Hybrid approach
– Non-critical part in high-level language
– Critical part in assembly language
• Most compilers are good enough that you do not need to write assembly code for general-purpose applications
– Except in the embedded or IoT domain
Assembler
• Translates a file of assembly language statements into a file of binary machine instructions and binary data.
• Two main steps:
– Find memory addresses for symbols (e.g. functions).
– Translate each assembly statement by combining the numeric equivalents of opcodes, register specifiers, and labels into a legal instruction
• Binary
• Produces object files
Object File
ELF Format: https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
#include <stdio.h>
int a[10]={0,1,2,3,4,5,6,7,8,9};
int b[10];
Inspect an ELF Object File or Executable
• Executable and Linkable Format (ELF)
– https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
• The readelf and objdump commands in Linux inspect object and executable files
– Of the two, only objdump can disassemble
Linking
• Linker (ld command) searches a collection of object files and
program libraries to find nonlocal routines used in a program,
combines them into a single executable file, and resolves
references between routines in different files.
Linking Multiple files to make executable file
• Two source files, prog1.c and prog2.c, for one single task
– Build a single executable file with the following steps
First, compile the two files with the option "-c":
gcc -c prog1.c
gcc -c prog2.c
-c: tells gcc to compile and assemble the code, but not link.
This produces two output files, prog1.o and prog2.o.
Then link these object files into a single executable file:
gcc -o prog prog1.o prog2.o
The output is the executable file prog.
Run the program with:
./prog
Linking with other libraries
• Normally, gcc finds the libraries our program needs in standard directories such as /usr/lib and links them in during the build.
– Libraries are collections of precompiled object files
Compile in Multiple Steps
Try readelf
Try objdump for both object file and executable
"objdump -D" to disassemble: converts binary object code back to symbolic assembly code
nm: list symbols from object files
• T: symbol defined in the text (code) section
• U: undefined symbol
– Left for the linker to resolve
• Addresses are relative
Static Linking
• If multiple programs want to use the read_timer functions
– They could each include the full definition in their source code
• Duplication: if the function changes, we need to change each file
– Or move read_timer into a separate file, compile it, and statically link it with the other object files to create the executables
• The same object code is still duplicated in every executable.
Static Library vs Shared (Dynamic) Library
• A static library is copied into every executable that uses it
– Bigger code size, but can be better optimized
• A shared library is loaded on the fly during execution
– Smaller code size, but a performance cost for loading the shared library at run time
• Both can be combined
Hands-On for dynamic linking
• Sum example for static and dynamic linking, starting from the sum.c and sum_full.c created in the last exercise:
– Create a new file read_timer.c that contains the definitions of read_timer and read_timer_ms
– Leave only the declarations of read_timer and read_timer_ms in sum_full.c
• They are the interface of the two functions.
– Compile read_timer.c into a dynamic library
• The library name is my_read_timer, and the library file is libmy_read_timer.so. You can choose any name.
– Compile sum.c and sum_full.c and link with the library my_read_timer
• gcc sum_full.c sum.c -o sum -L. -lmy_read_timer
– Use the ldd command to list dependent libraries
Build Steps with Dynamic Library
Loading a File for Execution
• Steps performed by the operating system:
– It reads the executable's header to determine the size of the text and data segments.
– It creates a new address space for the program. This address space is large enough to hold the text and data segments, along with a stack segment (see Section A.5).
– It copies instructions and data from the executable into the new address space.
– It copies arguments passed to the program onto the stack.
– It initializes the machine registers. In general, most registers are cleared, but the stack pointer must be assigned the address of the first free stack location (see Section A.5).
– It jumps to a start-up routine that copies the program's arguments from the stack to registers and calls the program's main routine. If the main routine returns, the start-up routine terminates the program with the exit system call.
Memory Layout of A Process
ELF format of an executable
• ABI
#include <stdio.h>