LLVM Crash Course

A quick introduction to LLVM, its intermediate representations, and its passes, plus a few lessons learned from setting up LLVM.

Uploaded by Lauren Huang

Classical Compiler Approach


Frontend: lexing, parsing, type checking
Optimizer: time/space improvements
Backend: instruction selection, register allocation, instruction scheduling

Each language gets its own complete pipeline:

C, C++, Obj-C source code -> Frontend -> Language-specific AST -> Optimizer -> Backend -> Machine code
Fortran source code -> Frontend -> Language-specific AST -> Optimizer -> Backend -> Machine code

LLVM Approach
Your Code (C, C++, Obj-C) -> Frontend (Clang)
Your Code (Python) -> Frontend (Python)
Your Code (Fortran) -> Frontend (Fortran)

Every frontend -> Common Intermediate Representation -> LLVM Optimizer

LLVM Optimizer -> LLVM Backend (ARM) -> Machine Code (ARM)
LLVM Optimizer -> LLVM Backend (x86) -> Machine Code (x86)
LLVM Optimizer -> LLVM Backend (PowerPC) -> Machine Code (PowerPC)

Intermediate representation (IR)


What uses IR?

LLVM compiles IR into native code (for example, x86 assembly)
Compiler front-ends generate IR
You can write your own IR code

Why use IR?

Common representation for many programming languages
Good for optimizations!

C:

    int main() {
        int a = 10;
        int b = a + 5;
        return 0;
    }

IR:

    define i32 @main() #0 {
    entry:
      %retval = alloca i32, align 4
      %a = alloca i32, align 4
      %b = alloca i32, align 4
      store i32 0, i32* %retval
      store i32 10, i32* %a, align 4
      %0 = load i32* %a, align 4
      %add = add nsw i32 %0, 5
      store i32 %add, i32* %b, align 4
      ret i32 0
    }

C:

    #include <stdio.h>

    int main() {
        printf("Hello world!\n");
        return 42;
    }

IR:

    @.str = private unnamed_addr constant [14 x i8] c"Hello world!\0A\00", align 1

    define i32 @main() #0 {
    entry:
      %retval = alloca i32, align 4
      store i32 0, i32* %retval
      %call = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([14 x i8]* @.str, i32 0, i32 0))
      ret i32 42
    }

Passes
Your Code (C, C++, Obj-C) -> Front End (Clang) -> Common Intermediate Representation -> LLVM Optimizer -> Back End (LLVM) -> Machine Code (x86, ARM)

Passes are applied to IR

Passes
An LLVM pass is an operation on a piece of IR

Used on modules, functions, instructions, etc.

What can LLVM passes do?

Mutate/transform IR
Compute higher-order information about IR
Used for optimizations and analysis
Can depend on other passes
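
To make "can depend on other passes" concrete, here is a minimal sketch of how a driver might order passes so that each one runs after the passes it depends on. It is a toy and does not use LLVM's API: PassTable, schedule, runOrder and the pass names in the usage note are all hypothetical. In LLVM itself a pass declares what it requires in getAnalysisUsage, and the pass manager does the scheduling.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical table: pass name -> names of the passes it depends on.
using PassTable = std::map<std::string, std::vector<std::string>>;

// Appends `name` to `order` after everything it depends on.
// Assumes the dependency graph has no cycles.
void schedule(const PassTable &passes, const std::string &name,
              std::set<std::string> &done, std::vector<std::string> &order) {
    if (done.count(name) > 0)
        return;  // already scheduled
    auto it = passes.find(name);
    assert(it != passes.end() && "pass is not in the table");
    done.insert(name);
    for (const std::string &dep : it->second)
        schedule(passes, dep, done, order);
    order.push_back(name);
}

// Returns the passes to run, in order, so that `target` runs last.
std::vector<std::string> runOrder(const PassTable &passes,
                                  const std::string &target) {
    std::set<std::string> done;
    std::vector<std::string> order;
    schedule(passes, target, done, order);
    return order;
}
```

For a table in which a made-up "my-optimization" pass depends on "alias-analysis" and "instruction-count", runOrder lists both dependencies before "my-optimization".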

Transformation Passes

Passes mutate the IR directly


Clean up and canonicalize code from front-end
Optimize code
Examples:
Dead code elimination
Merging functions
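
As an illustration of a transformation, the sketch below performs dead code elimination on a toy instruction list. It is not LLVM's implementation and uses none of LLVM's classes: ToyInst and eliminateDeadCode are hypothetical names, and instructions are identified only by the names they define and read.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy instruction; this is NOT an LLVM class.
struct ToyInst {
    std::string result;                 // name this instruction defines ("" if none)
    std::vector<std::string> operands;  // names this instruction reads
    bool hasSideEffects;                // true for things like store, call, ret
};

// Transformation: mutates `block` in place by dropping every instruction
// whose result is never read and that has no side effects.
// Returns true if the block was changed.
bool eliminateDeadCode(std::vector<ToyInst> &block) {
    const std::size_t before = block.size();
    bool removedSomething = true;
    while (removedSomething) {
        removedSomething = false;

        // Collect every name that some remaining instruction still reads.
        std::set<std::string> used;
        for (const ToyInst &inst : block)
            used.insert(inst.operands.begin(), inst.operands.end());

        // Keep live instructions; dropping one may make its operands dead,
        // which the next round of the loop picks up.
        std::vector<ToyInst> kept;
        for (const ToyInst &inst : block) {
            if (inst.hasSideEffects || used.count(inst.result) > 0)
                kept.push_back(inst);
            else
                removedSomething = true;
        }
        block.swap(kept);
    }
    assert(block.size() <= before);  // this cleanup never adds instructions
    return block.size() != before;
}
```

Given a block in which an add result is never read, the first round removes the add and the second round removes the load that only fed it. The function reports true because the toy IR was mutated, mirroring the bool returned by runOnFunction.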

Analysis Passes

Passes do not mutate the IR


Calculate information about the IR
Examples:
Instruction count
Alias analysis!
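
The instruction count example can be sketched in the same toy style. The code below is not an LLVM pass: ToyFunction and countOpcodes are hypothetical, and a function is reduced to a list of opcode names. Because the function is taken by const reference, the code can only compute information about it and cannot mutate it.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy function; this is NOT llvm::Function.
struct ToyFunction {
    std::string name;
    std::vector<std::string> opcodes;  // one opcode name per instruction
};

// Analysis: the function is taken by const reference, so it cannot be mutated.
// Returns how many times each opcode occurs.
std::map<std::string, int> countOpcodes(const ToyFunction &fn) {
    std::map<std::string, int> counts;
    for (const std::string &op : fn.opcodes) {
        assert(!op.empty() && "every instruction needs an opcode");
        ++counts[op];
    }
    return counts;
}
```

Fed the opcodes of the first main() example from the IR slides (three alloca, three store, one load, one add, one ret), the map holds five entries whose counts add up to nine instructions.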

A Simple LLVM Analysis Pass


    // LLVM is a collection of libraries; include only the headers you need.
    #include "llvm/Pass.h"
    #include "llvm/IR/Function.h"
    #include "llvm/Support/raw_ostream.h"

    using namespace llvm;

    namespace {
      // Hello is a subclass of FunctionPass. It will operate
      // on each function in the source file.
      struct Hello : public FunctionPass {
        static char ID;  // Create a unique ID for this pass.
        Hello() : FunctionPass(ID) {}

        // runOnFunction is a pure virtual method in FunctionPass.
        // This is where the analysis code goes.
        bool runOnFunction(Function &F) override {
          errs() << "Hello: ";
          errs().write_escaped(F.getName()) << '\n';
          return false;  // The IR was not modified.
        }
      };
    }

    // Initialization of the pass ID. The value does not matter.
    char Hello::ID = 0;

    // Register the pass. It can be invoked on the command line via opt with -hello.
    static RegisterPass<Hello> X("hello", "Hello World Pass", false, false);

http://llvm.org/docs/WritingAnLLVMPass.html

Compiling and Running the Pass


Can be built either in the LLVM source tree (via Makefile) or against compiled LLVM binaries (via CMake)
Compiles to a shared object
Invoked via the opt command:

    opt -load hello.so -hello < someCode.bc > /dev/null

    -load hello.so     Compiled pass
    -hello             Pass to run (name chosen in the registration code)
    < someCode.bc      LLVM IR file to analyze
    > /dev/null        Drop output, since this is an analysis pass

Lessons Learned Installing LLVM/Clang


Install Clang first, even if you don't plan to use it

CMake expects it and its dependencies to be present when building LLVM

Incompatibilities between Clang and the standard library

You may encounter a strange error: "no member named 'gets' in the global namespace". Not all versions of GCC's standard library implementation are compatible with all versions of Clang.

Always use CMake

Makefile support is minimal and documentation is often wrong

Always build from source

Repository binaries often lack dependencies needed when building a pass.
