SP Merged Slides


Systems Programming

Introduction to the Module


Yehia Elkhatib, Phil Trinder

Systems Programming
Chapter 01

Systems Programming (H)

Lecturers
• Yehia Elkhatib https://yelkhatib.github.io/
– Room: S322a, Sir Alwyn Williams Building
– Yehia.Elkhatib@glasgow.ac.uk

• Phil Trinder http://www.dcs.gla.ac.uk/~trinder/


– Room: S101c, Lilybank Gardens
– Phil.Trinder@glasgow.ac.uk

GTAs: James Bradford, Youssef Moawad, Vrinda Sharma, Robert Szafarczyk


Schedule
• Lecture: each Monday, either 10:00 or 11:00 (check your timetable)
• Lab: starting next week, Mondays at 13:00 / 14:00 / 15:00 / 16:00 in BOrr 720
Systems Programming
Chapter 01

Learning Outcomes

1. Demonstrate competence in low-level systems programming


2. Design, implement and explain the behaviour of low-level programs written in a
systems language
3. Explain the concepts of stack and heap allocation, pointers, and memory
ownership, including demonstrating an understanding of issues such as memory
leaks, buffer overflows, use-after-free, and race conditions
4. Explain how data structures are represented, and how this interacts with caching
and virtual memory, and to be able to demonstrate their performance impact
5. Discuss and reason about concurrency, race conditions, and the system
memory model
6. Build well-structured concurrent systems programs of moderate size, using
libraries and static analysis tools appropriately
In this course we will use C and C++ as they are the two most important systems
programming languages

Systems Programming
Course Overview and Schedule

• Full schedule on moodle page with instructions


• Schedule Highlights
– No lab this week, we start next week.
– No lecture or lab sheet in week 7 (beginning 31/10) as CW1 is due
Use the lecture & lab time (with demonstrators) to work on CW1
– CW1 submission on 03/11 at 16:30
– Guest lecture on 29/11 (TBC)
– CW2 submission on 24/11 at 16:30
– Revision Lecture in Semester 2; Exam in April/May

Systems Programming
Course Overview and Schedule

• Introduction to C and Data Structures


Yehia Elkhatib
Lectures 1 to 5

• Introduction to Concurrent Programming


Phil Trinder
• Lectures 6 onward

Systems Programming
Assessment

• There will be two pieces of coursework worth 30%


– CW1 worth 20% of your mark (released end of next week)
– CW2 worth 10% of your mark (released mid November)
• Both CW parts will be marked using moderated peer-review
– More on this later in the semester
• The exam in April / May is worth 70% of your mark
• There is an unmarked lab exercise each week for improving your
C/C++ programming skills. You can also get assistance with your
coursework during the Lab and ask questions.

Systems Programming
Lectures

• Lecture delivery plan: MAY BE ADAPTED AS WE LEARN!


– 1 hour for lecture and Q&A
– Each lecture includes coding and other exercises

Systems Programming
Labs

– Work for 30-60 minutes on the lab sheet


– Can be done before the lab session to identify issues and
prepare questions
– Each group of students will be assigned a Tutor as the first
point of contact
– Tutors will be announced by email / Teams

Systems Programming
Support & Communication

• Help us help you ☺ by asking questions!


• We will find the answer together by trying things out and writing
programs
– Questions will be answered with live coding if possible

[Diagram: where to get help, in escalation order: Lectures, Padlet, Lab tutor, MS Teams channel, Teams personal message or Email]

– Only email us with questions regarding personal matters

Systems Programming
Expectations: You

• What we expect from you


– Attend lectures
– Take notes (slides are not enough when you revise)
– Complete tasks / quiz
– Read around the subject and try things for yourself
– Ask questions in lectures and labs. This is your chance ☺
– Master your calendar: plan your learning hours carefully

Systems Programming
Expectations: Us

• We’ll do our best to:


– Make lecture notes and videos available on moodle
– Give you follow up references and additional reading material
– Make sure labs are running smoothly with good GTA support
– Arrange extra support if normal route (padlet, GTAs, etc.) is
exhausted
– Respond to email on student-specific issues
– We only respond to UoG email addresses

Systems Programming
Reading Material

• No book is required for this course (programming by yourself is the key to
understanding)
• Two introductory recommendations (left) and two definitive treatments of C and
C++ (right):

• For basic and syntactic questions:


– use the free C Programming Wikibook and cppreference.com
Systems Programming
What now?

• Attend today’s lecture


– Introduction to Systems Programming in C

• Complete the first quiz


– What languages do you speak? Experience and Expectations

• Reminder: No lab today

Systems Programming
Systems Programming
Introduction to C

© Yehia Elkhatib
Based partly on material by Anna Lito Michala

Systems Programming
What is Systems Programming?

• Systems programming is the activity of writing computer system software:


used as a platform for other software, a layer of abstraction (scaffolding)
i.e. the main ‘customer’ is other software, not necessarily users
• Examples of system software: operating system, drivers, compilers, …
• In contrast we call other software application software
• Usually, system software has particular performance constraints such as:
fast execution time,
low memory consumption,
low energy usage, ...
• To meet these constraints, systems programming languages allow finer-grained
control over the execution of programs

Systems Programming
Teaching Fundamentals, not a Language

• You will learn C and C++ as part of this course,


but this is a by-product and not the goal
• The main goal: to deepen your understanding of
fundamental principles and techniques in computer systems
– Memory and Computation as fundamental resources of computing
– Representation of data structures in memory and the role of data types
– Techniques for management of computational resources
– Reasoning about concurrent systems

Understanding these principles and techniques well will make you a better
computer scientist no matter what area you are interested in.

Systems Programming
History of Systems Programming (1)

1940s – early 1960s:


System software was supplied by hardware
manufacturers and was used by all users.

A single application always used the entire


machine.

Early programming languages:


Fortran, LISP, COBOL

Grace Hopper wrote one of the first compilers, to automatically turn
human-readable programs into machine code.

[Photo: Grace Hopper at the UNIVAC I, the first commercial computer, circa 1960]

Systems Programming
History of Systems Programming (2)

Until the 1970s: System software is written in processor-specific assembly languages


1970s: Ritchie and Thompson wanted to port UNIX v4 from the PDP-7 to the PDP-11

They looked for a portable programming language, and tried the B language but then
invented C as an imperative language supporting structured programming.

[Photo: a PDP-7 minicomputer]
[Photo: Ken Thompson (sitting) and Dennis Ritchie (standing) at a PDP-11
minicomputer (picture by Peter Hamer)]

Systems Programming
History of Systems Programming (3)

1980s: Bjarne Stroustrup aims to enrich C with new abstraction mechanisms


creates C++

A major influence is the first object-oriented


programming language Simula

Bjarne Stroustrup, the creator of C++

2010s: New systems programming languages appear. Rust (2010) and Swift (2014) are
successful examples that include many functional programming language features

Systems Programming
Programming Paradigms

• In imperative programming computations are sequences of statements that change


the program's state

• Structured programming organises programs with subroutines and structured control


flow constructs

• Object-oriented programming organises programs into objects that contain data and
encapsulate behaviour

• In functional programming programs are mathematical functions and avoid explicit


change of state

Systems Programming
Some questions about
a simple Python program

• Q: What value does x hold at the end of the program execution? (not a tricky question)
– Answer: 42
• Q: How much memory does Python take to store x? (a tricky question)
– Answer: It depends on the Python implementation.
sys.getsizeof(x) gives you the answer. On my machine: 24 bytes (* 8 = 192 bits)
• Q: How many instructions does Python execute to compute x+1? (even more tricky)
– Answer: I don't know, but many more than just the addition ...
Python is dynamically typed, so the data type of x could change at any time.
Every operation tests the data types of the operands to check which instruction to
execute.
The C implementation for the add operation starts here

Systems Programming
Some questions about
a simple C program

• Q: What value does x have at the end of the program execution? (not a tricky question)
– Answer: 42
• Q: How much memory does C take to store x? (not that tricky)
– Answer: sizeof(int) usually 4 bytes (* 8 = 32 bits)
• Q: How many instructions does C take to compute x+1? (not that tricky)
– Answer: 1 add instruction and 2 memory (mov) instructions

This level of control allows strong reasoning about the program’s performance and
execution behaviour

Systems Programming
C by example: Fibonacci

Fibonacci recursively: Fibonacci iteratively:

Systems Programming
Compiling a C program

• To execute a C program we must first compile it into machine executable code


• Java, Haskell (and many others) are also compiled languages
• The compiler translates the source code in multiple steps into machine executable code:
1. The preprocessor expands macros
§ (e.g. #include <stdio.h> or #define PI 3.14)

2. In the compiling stage, the source code is a) parsed and turned into an intermediate
representation, b) machine-specific assembly code is generated, and, finally, c) machine
code is generated in an object file
3. The linker combines multiple object files into an executable
• In this course we are using the clang compiler.
• To compile and then execute a C program run:
  clang -Wall -Werror source.c -o program
  ./program
Systems Programming
1. Preprocessing

Systems Programming
2a. Compiler intermediate
representation

Systems Programming
2b. Assembly code

Systems Programming
2c. Machine code

Systems Programming
Linking

• The linker combines one or more object files into a


single executable
• The linker checks that all functions called in the
program have machine code available (e.g. printf's
machine code is in the C standard library)
• If the machine code for a function can not be found
the linker complains:

Undefined symbols for architecture x86_64:


"_foo", referenced from:
_main in source.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1
(use -v to see invocation)

Systems Programming
Today’s task

• On moodle there are two examples of the Fibonacci


implementation (recursive and iterative)

1. Task 1: Download them, and run the examples using the


remote access system to the university servers.
– To access the university servers follow the instructions on
moodle

2. Task 2: Generate all the different representations from preprocessing
to object code.

Systems Programming
Next week: Basics of C

• Basics of C: variables, structs, control flow, functions


• What are data types, why are they important, and what
do they tell us about memory?
• If you know C already well you might skip the lecture
next week, but do come to the Lab!

Systems Programming
Systems Programming
The Basics of C

© Yehia Elkhatib
Based on material by Anna Lito Michala

Systems Programming
Questionnaire Results

Systems Programming
Questionnaire Results

Systems Programming
Questionnaire Results

Systems Programming
School Servers
• Only 3 servers are externally visible
• Files are automatically replicated across all
servers

<demo>

• The ssh command is:


ssh 1234567x@ssh1.dcs.gla.ac.uk
where "1234567x" is your GUID.
• The default password is the last 8 digits of
your student card number
• You will have access to your own home folder: /users/level3/<GUID>
• See Troubleshooting Guide on Moodle
Systems Programming
Padlet
• https://padlet.com/yelkhatib/sph2022
• The best way to get questions answered. Please use it!
• Please add one question per post.

Systems Programming
Resources – don't forget to use them!
• man pages
• cppreference.com

Systems Programming
Overview
• Today we are going to discuss fundamental features of C
• We will first highlight some differences between Java and C,
but not discuss every syntactic construct in C
• Topics in focus:
– the role data types play in assisting us to write meaningful programs
– lexical scope
– lifetime of variables
– call-by-value
– the difference between declarations and definitions
• Some useful tools
– clang, and the meaning of some compiler warnings and errors
– Makefile
• Tasks
Systems Programming
Java vs C
• We start by looking at some Java and equivalent C programs
• These are taken from rosettacode.org which collects programs in many different
languages
1. Computing the dot product:

z = Σ (i=0..n-1) xᵢ · yᵢ

Java:
public class DotProduct {
    public static void main(String[] args) {
        double[] x = { 1, 3, -5};
        double[] y = { 4, -2, -1};
        double z = 0;
        for (int i = 0; i < 3; i++) {
            z += x[i] * y[i];
        }
        System.out.println(z);
    }
}

C:
#include <stdio.h>
int main() {
    double x[] = { 1, 3, -5};
    double y[] = { 4, -2, -1};
    double z = 0;
    for (int i = 0; i < 3; i++) {
        z += x[i] * y[i];
    }
    printf("%f\n", z);
}

Systems Programming
Java vs C
2. Calculating the value of e
• The number e is defined as the infinite series:

e = Σ (n=0..∞) 1/n! = 1 + 1/1 + 1/(1·2) + 1/(1·2·3) + ⋯

Java:
public class CalculateE {
    public static final double EPSILON = 1.0e-15;
    public static void main(String[] args) {
        long fact = 1;
        double e = 2.0;
        int n = 2;
        double e0;
        do {
            e0 = e;
            fact *= n++;
            e += 1.0 / fact;
        } while (Math.abs(e - e0) >= EPSILON);
        System.out.printf("e = %.15f\n", e);
    }
}

C:
#include <stdio.h>
#include <math.h>
#define EPSILON 1.0e-15
int main() {
    long fact = 1;
    double e = 2.0;
    int n = 2;
    double e0;
    do {
        e0 = e;
        fact *= n++;
        e += 1.0 / fact;
    } while (fabs(e - e0) >= EPSILON);
    printf("e = %.15f\n", e);
}

Systems Programming
Java vs C
• As seen with the examples Java and C look and feel very similar
(when we ignore the object-oriented features in Java)
• Java has adopted the same syntax style that C introduced earlier

• If you know Java, you can start writing C code straight away!

• We will learn the differences of how Java and C programs are executed and
how memory is organised
• In the rest of this lecture, we will look in more depth into C and start exploring
more differences with Java

Systems Programming
Task

• Pick one of the Java files in Moodle lec2 folder


• Without looking at the C code, try to write a C equivalent
• Check your answer against the C code in the same folder.
Note any differences. Was there something that you did not
translate correctly? Any counter-intuitive parts?
• Use your notes to discuss this further in the labs

Systems Programming
Compiler Warnings and Errors
• Always remember: The compiler is your friend!
• Errors and warnings are feedback from the compiler that there
is something wrong with your program
• A compiler error indicates that it is impossible to compile your
program. You must change your program to get it compiling
• A compiler warning indicates an unusual condition that may (and quite
often does) indicate a problem
You should change the program to either fix the problem
or clarify the code to avoid the warning.
• There is no good reason to ignore warnings!
• The -Werror flag turns all warnings into errors making it impossible to compile a program with
warnings
• The -Wall flag enables most compiler warnings
• More info here: https://clang.llvm.org/docs/DiagnosticsReference.html
• Every warning and error caught by the compiler could otherwise have led to a problem at runtime!
• In this course we insist on using the -Wall -Werror flags (particularly for your coursework!)

Systems Programming
Makefile
• GNU make is the most popular build system for C programs, automating the process of
compiling them
• Alternative build systems: Maven, Bazel, Ninja, and many more
• For make we write a Makefile that has the following basic structure:
program : source.c
	clang -Wall -Werror source.c -o program
# ^ this space must be a single tab character!
For more details: man make

• The first two lines form a rule that explains how a target is built
• The first line is the dependency, made up of "[targets] : [sources]"
– The target(s) depend on the source(s); elements in each list are separated by a space
• The following line(s) are the action, i.e. commands defining how the target is built
– Need to start with 1 tab (Makefiles are white-space sensitive!)
• Running make will execute the first rule of the Makefile in the current directory
– make target will execute the rule to make the named target

Systems Programming
Task

• Check out the Makefile on Moodle


• Use the command line make
• See if your program compiles!

• Makefiles can be as simple or as complicated as you need


them to be. You can read more on them here if you are
interested (optional)

Systems Programming
Entry point for every C program
- main

• Every C program has exactly one main function which is the entry point into the program
#include <stdio.h>
int main() {
printf("Hello world!\n");
}

• The main function is special in the sense that if its closing } is reached 0 is
automatically returned
• A return value of 0 indicates a successful execution to the environment executing the
program
• A non-zero return value indicates an unsuccessful execution
• Two valid versions of main:
int main() { ... }
int main(int argc, char* argv[]) { ... }

• The second version allows processing of command line arguments


We will learn how to understand the signature next week when we learn about pointers

Systems Programming
Basic output with printf
• The printf function is defined in stdio.h and allows the formatted printing of values
• The first argument is the format string containing special characters indicating the
formatting of values
• The second and subsequent arguments are the values to be printed
• The number and order of special characters and values have to match

Special Characters | Explanation                                     | Argument Type
%c                 | single character                                | char
%s                 | character string                                | char *
%d                 | signed integer in decimal representation        | int
%f                 | floating-point number in decimal representation | float, double

Full list of special characters at: https://en.cppreference.com/w/c/io/fprintf

printf("%c %s", 'a', "string"); // prints: "a string"


printf("%f - %f = %d", 3.14, 1.14, 2); // prints: "3.140000 - 1.140000 = 2"

Systems Programming
Variables
• Variables are syntactically used exactly like in Java, e.g. int x = 4; x++;
• Variable definitions contain first the data type, then the name (or identifier) and
an initialization expression
int x = 4;
• Why do we need to declare the data type for every variable?
• Python doesn't require us to define data types:
x = 4
x = x + 1
• C is a statically typed programming language where every expression has to
have a data type which is known without running the program

But what are data types for anyway?


Systems Programming
Representation of values
and Data Types

• What is the meaning of the following bit-pattern?


1000 0001
• Or more precisely, what could this bit-pattern mean?
– maybe: 129 if we want to represent an unsigned 8-bit integer value
– maybe: -127 if we want to represent a signed 8-bit integer value with two's complement
– maybe: -1 if we want to represent an 8-bit Minifloat value
– but also maybe it means the colour blue? or an ASCII character? ...

• A bit-pattern has no meaning on its own!


We, the human programmers, give bit-patterns meaning!
• By declaring a variable with a data type (such as int) we decide what the bit-pattern in
memory means
Data types give bits meaning
That's why they are so important!

Systems Programming
Preserving meaning
thanks to data types

• To assist us writing meaningful programs the compiler enforces that computations


preserve the meaningful representation of our data
• For example: for x+1 the compiler ensures that a meaningful addition of the value one
and x is performed.
When x has the data type char and the value 42 this will modify the bit-pattern like this:
0010 1010 + 0000 0001 = 0010 1011

When x has the data type float and the value 42 this will modify the bit-pattern like this:
0 10000100 01010000000000000000000
+ 0 01111111 00000000000000000000000
====================================
0 10000100 01011000000000000000000

• By enforcing operations to respect the data types the compiler prevents meaningless
computations:
error: invalid operands to binary + (have ‘float’ and ‘char *’)
x = x + "1";
^

Systems Programming
Representation of
C variables in memory

• Every variable in C is stored at a constant location in memory that does not change over
its lifetime
• In C the data type of a variable determines its representation in memory:
int x = 42; // represented as: 0000 0000 0000 0000 0000 0000 0010 1010
float f = 42; // represented as: 0100 0010 0010 1000 0000 0000 0000 0000

• By choosing a particular integer data type we are in control of how much memory we use!

Type name                      | Typical size  | Value range
char / unsigned char           | 1 byte        | [-127, +127] / [0, 255]
short / unsigned short         | 2 bytes       | [-32767, +32767] / [0, 65535]
int / unsigned int             | 4 bytes       | [-2147483647, +2147483647] / [0, 4294967295]
long / unsigned long           | 4 or 8 bytes  | at least as for int
long long / unsigned long long | 8 bytes       | [-(2^63 - 1), +(2^63 - 1)] / [0, 2^64 - 1]

• For example, to store 1 million temperature measurements in Celsius:
choosing an array of char values over an array of int values saves almost 3 MB of data
this is almost the entire size of my processor's L3 cache!

Systems Programming
Boolean values
• We have now seen multiple data types for representing integer and floating point values
• What about the boolean values true and false?
• In C every integer can be interpreted as a boolean, where 0 represents false and any
other value true:

int i = 5;
while (i) {
    i = i - 1;
    printf("%d\n", i);
}
printf("done\n");

This prints 4, 3, 2, 1, 0, each on its own line, followed by done.

• Since the C99 standard it is also possible to use the #include <stdbool.h> header:
#include <stdbool.h>
int main() { bool a = true; bool b = false; bool c = a && b; printf("%d\n",
c); }

• In stdbool.h true and false are declared as macros with the values 1 and 0
– all small letters

Systems Programming
Lexical Scoping
• Each pair of curly braces { } is called a block in C and introduces a lexical scope
• Variable names must be unique in the same lexical scope
• For multiple variables with the same name, the variable declared in the innermost
scope is used.

Which values will be printed?


int main() {
    int i = 5;
    {
        int j = foo(i);
        printf("%d\n", j);
    }
}

int foo(int i) {
    int j = i;
    {
        int j = i + 2;
        printf("%d\n", j);
    }
    return j + 1;
}

First 7 (from inside foo) and then 6 (from inside main) will be printed

Systems Programming
Variable Lifetime
• Variables are stored at locations in memory that do not change over their lifetimes
• But what is the lifetime of a variable? This depends on how the memory for the variable
was allocated. There are three cases:
1. automatic: These are all variables declared locally in a block (i.e. inside a pair of {}) and their
lifetime ends at the end of the block. All variables we have seen so far fall into this category.
int main() {
int x = 42;
} // end of the block - end of the lifetime of x

2. static: Variables declared with the static keyword or defined at file-level outside all blocks. The
lifetime is the entire execution of the program.
int foo() {
static int count_calls_to_foo = 0; // the variable is initialized only once
count_calls_to_foo++;
} // variable continues to live

3. allocated: These are variables for which we must explicitly request memory using dynamic
memory allocation functions (such as malloc). We manage the lifetime of these variables
ourselves (next lecture!).

Systems Programming
Stack-based Memory Management
• When the lifetime of an automatically managed variable ends, its memory location is
freed and can be reused by other variables
• This memory management happens fully automatically as it is very easy to implement:
– every time a block is entered, put aside a location in memory for every variable
declared in the block
– every time a block is exited, free the locations in memory for every variable
declared in the block
• As this memory management strategy adds and removes memory locations in a
last-in-first-out manner we call this stack-based memory management and the area of
memory managed in this way the stack

• We will learn in the next two weeks the other important area of memory called
the heap which is managed manually by the programmer

Systems Programming
Types combining multiple elements
-Struct
• So far we have only seen variables with basic types.
There are two important types which combine multiple elements: arrays and structs
• A struct consists of a sequence of members of (potentially) different types:
struct point {
int x;
int y;
};
int main() {
struct point p = {1, 2};
printf("x = %d\ny = %d\n", p.x, p.y);
}

• Members are accessed like public class members in Java using the . notation
• Members are stored next to each other in memory in the same order as defined in
the struct
• The type of a struct is written struct name, but we often use this trick to shorten it:
typedef struct { int x; int y; } point;
int main() { point p = {1, 2}; /* ... */ }

Systems Programming
Types combining multiple elements
-Array
• Arrays consist of multiple elements of the same type:

int main() {
int x[2] = {1, 2};
printf("x[0] = %d\nx[1] = %d\n", x[0], x[1]);
}

• Arrays which are stored on the stack must have a fixed size (so that the memory is
automatically managed)
• We will learn how to use arrays for which the size can change next week
• The elements of an array are stored next to each other in memory

Systems Programming
Types combining multiple elements
-String
• Strings in C are represented as arrays of characters
char greeting[] = "Hello World";

is the same as
char greeting[] = {'H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd', '\0'};

• We use double quotes " to write a (ASCII) string literal and single quotes ' to write a
character literal
• Strings in C are terminated by the special character '\0' that is added automatically for
string literals
• To print a string with printf we use the %s special character
printf("%s\n", greeting);

• If we forget the terminating '\0' character this will print the content of the memory
until it hits the next bit pattern equivalent of '\0'!

Systems Programming
Functions
-Definition
• Functions are probably the most important abstraction mechanism in computing science
They allow us to write code in a modular fashion which we can then reuse
• A function definition in C looks like this:
int max(int lhs, int rhs) {
if (lhs > rhs) { return lhs; } else { return rhs; }
}

• The return type specifies the data type of the value that will be returned after evaluating
the function. If a function returns no value the special type void is used as return type
• The name should describe the behaviour of the function
• A parameter list specifies the data type and name of each parameter expected by the
function
• The function body is a block containing the code executed when calling the function
• To call a function we provide an argument for each parameter and capture/process the
return value
int x = max(5, 3);

Systems Programming
Functions
-Declaration vs. Definition
• A function definition (previous slide) fully specifies the behaviour of the function
• A function declaration only specifies the interface describing how a function can be used:
int max(int lhs, int rhs);

• Function declarations are important for writing modular software as they allow separating
the interface (the declaration) from the implementation (the definition)
• For calling a function (e.g. max or printf) the compiler checks that the data types of the
call expression and the function declaration match:
int max(int lhs, int rhs);
int main() { int x = max(4.5, 'b'); }
max.c:6:17: error: implicit conversion from 'double' to 'int' changes value from 4.5 to 4 [-Werror,-Wliteral-conversion]
    int x = max(4.5, 'b');
                ~~~ ^~~

• The linker searches for the definition which might be in a different file or library (as
for printf)

Systems Programming
Functions
- Call-by-value
• When we call a function how are the arguments passed to it?
e.g. what happens to a variable that is passed to a function which modifies its
parameters inside its body?
void set_to_zero(int x) { x = 0; }
int main() {
int y = 42;
set_to_zero(y);
printf("%d\n", y); // what will be printed?
}

• The value 42 will be printed, because when we call a function we pass a copy of the
arguments to it
• That means x and y in the example above are stored at different locations in the memory
When set_to_zero is called the value of y is copied into the memory location of x
• We call this call-by-value as the argument value is evaluated and stored in the parameter
variable
void foo(int x);
int main() { foo( 5 + 4 ); /* first evaluate 5 + 4 to 9 and then store it in x */ }

Systems Programming
Functions
- Call-by-value - Structs and Arrays
• Functions can accept structs and arrays as parameters:
typedef struct { int x; int y; } point;

int search(point needle, point haystack[], int haystack_size) {


for (int i = 0; i < haystack_size; i++) {
point candidate = haystack[i];
if (needle.x == candidate.x && needle.y == candidate.y) { return i; }
}
return -1;
}

• structs behave like the basic types and are passed-by-value


Changes to the struct's members are therefore not visible outside the function
• arrays are treated slightly specially:
– instead of copying the array (very expensive), the address of the first element is passed
– because of this, changes to the array elements are now visible outside the function
• The special treatment of arrays will make more sense when we learn
about pointers next week

Systems Programming
Task

• Give the example programs on Moodle a try


• Try to implement a struct and using it as a parameter for a
function (see previous slide)
• Alter the struct element values in the function and print the
struct after the function call. See if your changes were
persistent.

Systems Programming
Next week

• Next week we are going to deepen our understanding of


memory
• We will learn what pointers are and why they are
fundamental in understanding C and memory
• We will learn the first steps of using dynamic memory
allocation on the heap using malloc

Systems Programming
Systems Programming
Memory & Pointers

Yehia Elkhatib
Based on material by Anna Lito Michala

Systems Programming
Housekeeping
You’ve seen how the gateway can be affected by excessive load
• If you are going to edit code remotely (e.g. using VS Code)
– login to ssh1/sibu
– push files to ssh1/sibu
– compile ONLY on stlinux servers
• If you will edit code directly on the servers
– login to ssh1/sibu
– login to an stlinux server
– edit and compile on an stlinux server

• Make sure you leave enough time for testing your code on the stlinux servers
– do not leave it till the last hour and complain that you cannot access the servers

Systems Programming
Housekeeping
• On Moodle:
– Solutions to last week’s lab
– CW1 specification
– Document explaining BSTs (and in this lecture)

Systems Programming
Overview

• Memory
• Pointers
• Stack vs Heap
• Dynamic Memory Allocation

Systems Programming
What is memory?
From the Oxford English Dictionary:
memory | ˈmɛm(ə)ri |
noun
1. the faculty by which the mind stores and remembers information
2. something remembered from the past
3. the part of a computer in which data or program instructions can be stored
for retrieval.

Systems Programming 4
How should we think about memory?

We can think of memory as a filing cabinet where each box stores the value of a variable
The variable name is the label which allows us to remember where we stored what

Systems Programming
How should we think about memory?
• We can also think of memory as a single long street where each house has a
unique address
• We have some notion of spatial locality: Houses close to each other are neighbours, but
there are also houses far away at the other end of the street

Systems Programming
Computer Memory in Programming
• We can think of memory as a very large one-dimensional array (or a large buffer)
where instead of indices into the array we use names to identify variables
• These two programs behave exactly the same:

int main() { int main() {


int memory[3];
int x = 42; memory[0] = 42;
int y = 23; memory[1] = 23;
int z = x * y; memory[2] = memory[0] * memory[1];
} }

• …but their memory allocation is slightly different


– acquired block by block vs. acquired as a sequence of blocks at once

Systems Programming
Byte Addressable Memory
• Every byte (a collection of 8 bits) in memory has its own address
• On a 64-bit architecture these addresses are 64 bits (or 8 bytes) long
• In theory, a 64-bit architecture can address up to 2^64 bytes = 16 exabytes
• In practice, x86-64 only uses the lower 48 bits of an address,
supporting up to 2^48 bytes = 256 TB

• To modify a single bit we always have to load and store an entire byte:

#include <stdio.h>
int main() {
unsigned char byte = 0;
byte = byte | (1 << 3); // set bit 3 (counting from 0) to 1
printf("%d\n", byte); // prints 8 (== 2^3)
}

Systems Programming
• We need 3 things:
– get the memory address of a variable (i.e. pointer)
– pass pointers (e.g. to functions) for manipulation
– set a value at a pointer

Systems Programming
Variables in Memory
• As we learned last week:
Every variable in C is stored at a memory location that does not change over its lifetime
• The location of a variable is identified by its address
• Depending on the size of the data type, the value of a variable may span multiple
bytes in memory
• We can ask the address of a variable in C using the address-of operator &:
int main() {
int x = 42;
int y = 23;
printf("&x = %p\n", &x); // print the address of x
printf("&y = %p\n", &y); // print the address of y
}

• We can ask the size of a variable using the sizeof operator:
In the example sizeof(x) evaluates to 4, as an int value is represented by 4 bytes
(on most platforms)

Systems Programming
Pointers
• We store the address of a variable as the value of another variable that we call a pointer
int main() {
int x = 42;
int * pointer_to_x = &x; // this is a pointer referring to x
printf("value of pointer_to_x: %p\n", pointer_to_x); // prints 0x77...

• The dereference operator * allows us to access the value of the variable we are
pointing to:

printf("value of x: %d\n", *pointer_to_x); // prints 42


}

• A pointer to a variable of data type t has the data type t *


• Every pointer has the same size (independent of the type it is pointing to) because it
stores an address
• Pointers on a 64-bit architecture are 8 bytes (or 64 bits)

Systems Programming
Pointers are normal variables
• A pointer is a variable like any other, that means:
• The pointer is stored at its own location
int x = 42; // stored at 0x7ffeedbed3dc
int * ptr = &x; // stored at 0x7ffeedbed3d0

• We can take the address of where the pointer is stored using the address operator:
printf("%p\n", &ptr); // prints 0x7ffeedbed3d0

• We can store the address of a pointer in another pointer:


int * * ptr_to_ptr = &ptr; // stored at 0x7ffeed7ed3c8

• We can change where a pointer points to:


int y = 23; // stored at 0x7ffeebaf23c4
ptr = &y;

Systems Programming
Call-by-value Revisited
• We learned last week that arguments are passed by-value to functions
• That means that the value of the argument is copied into the parameter variable
• This is also true for pointers
• Arrays are treated specially and a pointer to the first element is copied instead of the
entire array
float average(float array[], int size) {
float sum = 0.0f;
for (int i = 0; i < size; i++) { sum += array[i]; }
return sum / size;
}

• The array is treated like a pointer; in fact float param[] and float * param are
interchangeable as function parameters:
float average(float array[], int size);
float average(float * array, int size);

Systems Programming
Pointers and NULL
• Sometimes there is no meaningful value for a pointer at a certain time
• We use the value 0 or the macro NULL to represent a pointer that points to nothing
• NULL often represents an erroneous state, e.g. an element was not found in an array
// return pointer to value found in array
// return NULL if value not found in array
float* search(float needle, float haystack[], int haystack_size) {
for (int i = 0; i < haystack_size; i++)
if (needle == haystack[i]) { return &haystack[i]; }
return NULL;
}

• Dereferencing a NULL pointer will crash your program!


This has led to many software bugs.
The inventor of NULL, Tony Hoare, has called it
his billion-dollar mistake

Systems Programming
Pointers and const
• In C every variable can be annotated with the type qualifier const to indicate that the
content of the variable can not be changed
• This is enforced by the compiler:
const_error.c:4:6: error: cannot assign to variable 'pi' with const-qualified
type 'const float’
pi = 2.5;
~~ ^
const_error.c:3:15: note: variable 'pi' declared const here
const float pi = 3.14;
~~~~~~~~~~~~^~~~~~~~~

• Pointers can be const, i.e. unmodifiable, in three ways:


1. The pointer itself, i.e. the address, can not be changed: float * const ptr;
2. The value we are pointing to can not be changed: const float * ptr;
3. Both value and pointer can not be changed: const float * const ptr;

Systems Programming
Pointers and Arrays
• Pointers and arrays are closely related
• The name of an array refers to the address of the first element when assigned to a
pointer variable:
int vector[6] = {1, 2, 3, 4, 5, 6};
int * ptr = vector; // this is equivalent to: int * ptr = &(vector[0]);

• We can use the array indexing notation on pointers:


printf("5th element: %d\n", ptr[4]); // prints "5th element: 5"

• The expressions ptr[i] and *(ptr + i) are equivalent


• Two important differences:
– sizeof returns different values (size of array vs. size of pointer):
printf("%ld\n", sizeof(vector)); // prints '24' (== 6 * 4 bytes)
printf("%ld\n", sizeof(ptr)); // prints '8' (size of a pointer)

– we can not change a vector, only its elements:


vector = another_vector; // error: array type 'int [6]' is not assignable

Systems Programming
Pointer Arithmetic
• We can use pointer arithmetic to modify the value of a pointer. We can:
1. Add and subtract integer values to/from a pointer
2. subtract two pointers from each other
3. compare pointers
int vector[6] = {1, 2, 3, 4, 5, 6};
int * ptr = vector; // start at the beginning
while (ptr <= &(vector[5])) {
printf("%d ", *ptr); // print the element in the array
ptr++; // go to the next element
}

• Pointer arithmetic takes into account the size of the type the pointer is pointing to:
int * i_ptr = &i;
char* c_ptr = &c;
i_ptr++; // this adds 4-bytes (1x sizeof(int)) to the address stored in i_ptr
c_ptr+=2; // this adds 2-bytes (2x sizeof(char)) to the address stored in c_ptr

Systems Programming
Linked List with pointers and structs
• Pointers and structs are useful in building data structures
– as an example we look at a (single) linked list
• A linked list consists of nodes, each with a value and a pointer to the next node:
struct node { char value; struct node * next; };

• The last node in the list has a next-pointer to NULL


int main() {
struct node c = {'c', NULL};
struct node b = {'b', &c};
struct node a = {'a', &b};
• We use a pointer to iterate over the linked list:
struct node * ptr = &a;
while (ptr) {
printf("%c\n", (*ptr).value); // %c prints the char value
ptr = (*ptr).next;
}
}

Systems Programming
Binary (Search) Tree
struct node {
char value;
struct node * left_child;
struct node * right_child;
};

• If we keep the binary tree ordered (thus it forms a binary search tree) we can search it
efficiently:
struct node * find(char value, struct node * root) {
if (value == root->value) {
return root;
}
if (value < root->value && root->left_child != NULL) {
return find(value, root->left_child);
}
if (value > root->value && root->right_child != NULL) {
return find(value, root->right_child);
}
return NULL;
}

• s_ptr->m is a notation to access member m of a struct-pointer s_ptr


and is equivalent to (*s_ptr).m
Systems Programming
main with command line arguments
• This allows us to understand the definition of the main function that processes command
line arguments

int main(int argc, char* argv[]) { ... }

• The argc argument specifies the number of command line arguments


• The argv argument specifies an array of the command line arguments as strings
– A single string is represented as an array of characters: char *
– The type of argv, char * [], can also be written char * *

#include <stdio.h>
int main(int argc, char * argv[]) {
// print every command line argument
for (int i = 0; i < argc; i++)
printf("%s\n", argv[i]);
}

Systems Programming
Writing Generic Code with void *
• Sometimes we want to write code in a generic way so that it works with all data types
e.g. swapping two variables or sorting algorithms
• To swap two variables x and y of arbitrary type we copy all bytes at the location
of x to y and vice versa
• For this we write a function swap that takes two pointers and the number of bytes that
have to be swapped:
void swap(void *x, void *y, size_t l) {
char *a = x, *b = y, tmp;
while(l--) {
tmp = *a;
*a++ = *b;
*b++ = tmp; }
}

• The special pointer of type void * is a generic pointer. Every pointer is automatically
convertible to it
– A void-pointer only serves as an address pointing to something 🤷
• We can not access the value we are pointing to as we don't know what those bits
mean. Hence, dereferencing a void pointer is forbidden.
Systems Programming
Systems Programming
Stack vs. Heap memory regions
• So far we have only seen variables with automatic lifetime
managed by the compiler
• These variables are stored in the stack part of the memory
– The size of every variable on the stack has to be known
statically, i.e. without executing the program
• For many important use cases we don't know the size of a
variable statically, e.g. dynamically sized arrays
• For such cases we manage the memory manually by dynamically
requesting and freeing memory
• The part of the memory used for dynamic memory allocation is
called the heap
• The stack and the heap share the same address space and grow
towards each other with use

Systems Programming
Dynamic memory management
• We request a new chunk of memory from the heap using the malloc function:
void* malloc( size_t size ); // defined in <stdlib.h>

• We specify the number of bytes we like to allocate


• If malloc succeeds, it returns a void-pointer to the first byte of the uninitialised block
• If malloc fails, NULL is returned
– It’s good practice to check for this
• Memory allocated with malloc must be manually deallocated by calling free (exactly
once):
void free( void* ptr ); // defined in <stdlib.h>

• If free is not called, we leak the allocated memory
– i.e. memory blocks stay allocated even though they are no longer used
• It’s good practice to set a pointer to NULL after freeing it

Systems Programming
Dynamically sized array example
• We use malloc to implement dynamically sized arrays
• Here the size of the array depends on the first number entered by the user
#include <stdlib.h>
#include <stdio.h>
int main() {
printf("How many numbers do you want to average?\n");
int count;
if (scanf("%d", &count) == EOF) { exit(-1); }
// allocate memory based on dynamic input (here its size)
int* array = malloc(count * sizeof(int));
for (int i = 0; i < count; i++) {
int number;
if (scanf("%d", &number) == EOF) { exit(-1); }
array[i] = number;
}
float sum = 0.0f;
for (int i = 0; i < count; i++) { sum += array[i]; }
printf("The average is %.2f\n", sum / count);
free(array); array=NULL; // free the memory manually after use
}

Systems Programming
Linked list extended example
• Using malloc we are now able to define functions for creating, extending, freeing lists:
struct node { char value; struct node * next; }; // same as before
struct node* create_node(char value) {
struct node * node_ptr = malloc(sizeof(struct node));
if (node_ptr==NULL) return NULL;
node_ptr->value = value; node_ptr->next = NULL;
return node_ptr;
}
void add_node_to_list(struct node* list, struct node* node) {
if (!list) return;
while (list->next) { list = list->next; }
list->next = node;
}
void free_list(struct node* list) {
while (list) {
struct node* head = list;
list = list->next;
free(head); head=NULL;
}
} // note: loop on 'list' (not 'list->next') so the last node is freed too
Systems Programming
Returning a pointer to a local variable
• It is an easy mistake to return a pointer to a local variable. Never do it!
– Why?
• By returning a pointer to a local variable the pointer has a longer lifetime than the
variable it is (was) pointing to
struct node* create_node(char value) {
struct node node;
node.value = value;
node.next = NULL;
return &node;
} // lifetime of node ends here...
// but its address lives on in a_ptr
int main() {
struct node* a_ptr = create_node('a');
// ...
} // lifetime of a_ptr (pointing to node) ends here

Instead: pass in a pointer to an existing node and manipulate it inside the function,
OR allocate the memory on the heap (with malloc) and return that pointer

Systems Programming
Task
To practice some of what we learned today try to write this very short program:
• T1: ask the user to pass in an integer as a command line argument
• T2: use this integer to create a dynamically allocated array of that size
• T3: add values 0 to size-1 as elements in the newly formed array
• T4: for every element of the array, print the element’s address (pointer) and value
» Eg of first output line: 000000ff 0
• T5: change the for loop so that you use pointer arithmetic and see if your output is
the same!

Systems Programming
Next week
• We are going to look at resource management and ownership
• We will particularly focus on memory management

Systems Programming
Systems Programming
Debugging and Development Tools

Yehia Elkhatib
Based on material by Anna Lito Michala

Systems Programming
Housekeeping
• stlinux0* access
• Make your coursework submission anonymous
• No extensions
• Aropa info later

• How’s it going?
– Please feed back to me your concerns and comments, both +ve and -ve
– I can work on addressing them now, as opposed as to later
– Contact me directly, or go through the class reps

Systems Programming
Overview

• Function Pointers
• Bugs
• Debugging
• Static Analysis Tools
• Dynamic Analysis Tools

• Peer Review (CW1b)

Systems Programming
Function Pointers
• Memory does not only store data but also program code. It is possible to have a pointer
pointing to code.
• These pointers are called function pointers (cf. the visitor design pattern) and have the type:
return_type (*)(argument_types)
• Now we can implement a generic print function for a binary tree:
void print_string(const void * value_ptr) {
const char * string = value_ptr; // by changing the type we give the bits meaning
printf("%s\n", string);
}
void print(node * root, void (* print_function)(const void *) ) {
if (root) {
print_function(root->value_ptr);
print(root->left_child, print_function);
print(root->right_child, print_function); }
}
int main() {
node * root = ... ;
print(root, print_string);
}
• Function names are automatically converted to function pointers, i.e. we don't have to
write &print_string
Systems Programming
Bugs and finding them
• Software development is not free of errors
• We call these software errors bugs
– syntactical – grammar => compiler
– semantic – meaning => you
• We will enumerate a number of tools to
help us identify and understand bugs:
– Debuggers: run a program in a
controlled environment where we can
investigate its execution
– Static analysis tools: reason about
a program's behaviour without running it
– Dynamic analysis tools: add
Logbook of Grace Hopper’s team in 1947
instructions to a program to detect bugs
at runtime

Systems Programming
Debugger
• The two popular debuggers for C/C++ programs are GDB and LLDB
– very similar to each other
1. We need to compile the program with the -g flag to add debug information into the binary
2. Instead of executing the program normally we start the debugger and load the program:
$ ./program arg1 arg2 ; "Run program normally"
$ gdb --args ./program arg1 arg2 ; "Load program in GDB debugger"
$ lldb -- ./program arg1 arg2 ; "Load program in LLDB debugger"

3. Inside the debugger we start the execution with the run command
– other useful commands: b (set breakpoint), n (step over), s (step into), bt (backtrace), list (show source)

• A full list of LLDB commands:


https://lldb.llvm.org/lldb-gdb.html
https://lldb.llvm.org/use/tutorial.html
• To exit, run the quit command inside the debugger

Systems Programming
Segmentation fault
• This runtime error message is one of the most common in systems programming:
[1] 83437 segmentation fault ./program

• A segmentation fault (segfault) is raised by the hardware notifying the operating


system that your program has attempted to access a restricted area of memory
• The operating system will then immediately terminate your program
– it is possible to catch the segmentation fault signal and handle it (see OS module)
• The most common causes for segfaults are:
– Dangling pointer
– Dereferencing a NULL pointer
– Writing to read-only memory
– A buffer overflow
– A stack/heap overflow

Systems Programming
Common Causes of SegFaults
• Dangling pointer
– a reference to memory which has been de-allocated then allocated to another var
– if you NULL after freeing, it is easier to check if this pointer is “active”
• Dereferencing a NULL pointer
– NULL is an invalid memory location
• Buffer overflow
– accessing memory outside allocated bounds, typically with arrays
– reads of these extra locations may appear to succeed, but the data is garbage
• Stack overflow
– exhausting the memory allocated for the stack
– often triggered by a recursion without a base case
• Heap overflow
– exhausting heap memory, e.g. through memory leaks

Systems Programming
Where did my
segmentation fault come from?

• To find the line that triggered a segfault, we load and run the program in the debugger:
$ lldb -- ./program 12345
(lldb) run
Process 85058 launched: './program' (x86_64)
2018-10-21 20:56:00.106714+0100 program[85058:12827554] detected buffer overflow
Process 85058 stopped
...

• The debugger has now stopped the execution.


Using the bt (backtrace) command we investigate the calls leading to the segfault:
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
* frame #0: 0x00007fff76d1ab86 libsystem_kernel.dylib`__pthread_kill + 10
...
frame #6: 0x00007fff76ca8e84 libsystem_c.dylib`__strcpy_chk + 8
frame #7: 0x0000000100000eaf program overflow(argc=2, argv=0x00007ffeefbff300) at
program.c:35
frame #8: 0x0000000100000efb program`main(argc=2, argv=0x00007ffeefbff3c8) at program.c:42
frame #9: 0x00007fff76bdc085 libdyld.dylib`start + 1
frame #10: 0x00007fff76bdc085 libdyld.dylib`start + 1
• Here frame 7 shows us the file (program.c) and line (35) that triggered the segfault

Systems Programming
Breakpoints and GUI for debugging

• Breakpoints are points at which the execution is stopped so we can investigate the
state of the execution
• Setting breakpoints in the command line debugger is tedious (but possible)
• The Visual Studio Code editor (and others, like Atom) provide a GUI for GDB / LLDB

Detailed instructions for the configuration can be found at:


https://code.visualstudio.com/docs/editor/debugging
https://code.visualstudio.com/docs/cpp/config-linux
Systems Programming
Static Analysis
• A static analysis reasons about the code without executing it
• The compiler performs some static analysis every time you compile your code,
e.g. type checking
• It is good practice to enable all warnings (-Wall) and make warnings errors (-Werror)
This way the compiler can be most helpful in detecting bugs before execution
• Some static analysis is too expensive to perform in every build
Other static analysis enforces a particular coding guideline
• These types of static analysis are available with special flags or in separate tools
• We are going to use the tools provided by the clang compiler
• We invoke the static analyzer using a flag and specifying the output format of the
report:
clang --analyze --analyzer-output html program.c

Systems Programming
clang static analyzer report
• The tool generates a report explaining the potential bug

• Here the tool has reported a "garbage value" which in this case leads to an undefined
outcome for an if-branch
Systems Programming
clang-tidy
• A linter (the name comes from the first UNIX tool to perform static analysis on C)
• clang-tidy is a flexible tool that allows us to enforce coding guidelines and to modernise
source code. It is possible to extend clang-tidy by writing your own checks
• It is invoked like clang, accepting the same flags but after two dashes: --
• A series of checks can be en- and disabled. Here we enable the checks for readability:
$ clang-tidy -checks="readability-*" 6.c -- -Wall -Werror
/Users/lito/Desktop/examples/6.c:2:16: warning: pointer parameter 'p' can be pointer to const
[readability-non-const-parameter]
void test(int *p) {
~~~ ^
const
/Users/lito/Desktop/examples/6.c:4:9: warning: statement should be inside braces
[readability-braces-around-statements]
if (p)
^
{

– It suggests using const and putting braces around the branch of an if statement
• Detailed information is available at: http://clang.llvm.org/extra/clang-tidy/

Systems Programming
Use the analyzer results
to improve your code

Systems Programming
Dynamic Analysis Tools
• There exists a family of bug detection tools that use dynamic analysis
• These tools need the program to run and can only detect bugs which are
encountered during the execution of a particular test input
• The clang project calls these tools sanitizers. The most important are:
– AddressSanitizer - a memory error detector
– MemorySanitizer - a detector of uninitialized reads
– LeakSanitizer - a memory leak detector
– UndefinedBehaviorSanitizer - a detector of undefined behaviour
• Later in the course, we will use:
– ThreadSanitizer - a data race detector

Systems Programming
Address Sanitizer
• Address Sanitizer is a memory error detector for:
– Out-of-bounds accesses
– Use-after-free
– Double free
• This makes clang insert instructions in your program to monitor every memory access
• This slows down the execution by about 2x
– valgrind (a similar tool) often has a slowdown of 20-100x!
• These flags enable address sanitizer:
clang -fsanitize=address -fno-omit-frame-pointer -O1 -g -Wall -Werror program.c -o program
– fno-omit-frame-pointer produces a readable call stack
– O1 enables basic optimizations
• The compiler will produce a binary as usual: ./program
• Address Sanitizer has found hundreds of bugs in large scale software projects
– e.g. Chromium and Firefox

Systems Programming
Address Sanitizer output
• The output reports a heap-use-after-free on address 0x614000000044 and provides
information where the memory was freed (line 5) and allocated (line 4)

Systems Programming
Memory Sanitizer
• Memory sanitizer detects uninitialized reads to memory:
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char** argv) {
int* a = malloc(sizeof(int)*10);
a[5] = 0;
if (a[argc])
printf("xx\n");
return 0;
}

% clang -fsanitize=memory -fno-omit-frame-pointer -g -O2 umr.cc


% ./a.out
WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x7f45944b418a in main umr.c:6
#1 0x7f45938b676c in __libc_start_main libc-start.c:226

• This tool is under active development and currently only available for Linux and BSD

Systems Programming
Leak Sanitizer
• Leak sanitizer detects memory leaks
– i.e. memory which has not been freed at the end of the program:
#include <stdlib.h>
void *p;
int main() {
p = malloc(7);
p = 0; // The memory is leaked here.
return 0;
}

% clang -fsanitize=address -g memory-leak.c


% ./a.out
==23646==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 7 byte(s) in 1 object(s) allocated from:
#0 0x4af01b in __interceptor_malloc /projects/compiler-rt/lib/asan/asan_malloc_linux.cc:52:3
#1 0x4da26a in main memory-leak.c:4:7
#2 0x7f076fd9cec4 in __libc_start_main libc-start.c:287
SUMMARY: AddressSanitizer: 7 byte(s) leaked in 1 allocation(s).

• This tool is also under active development and available for Linux, macOS, NetBSD

Systems Programming
Undefined Behaviour
• Semantic bugs
• For certain operations, the C standard demands no particular behaviour.
These are typically cases that are considered bugs, e.g. dereferencing a null pointer
• It is expensive to check every time a pointer is dereferenced if the operation is valid
• Because the compiler does not have to ensure a particular behaviour for null pointers,
it can assume that the programmer has ensured that the pointer is not null
• Undefined behaviour is therefore crucial for fast code, but makes detection of bugs
much harder, as it is not guaranteed that a bug will result in a program crash
• A good introduction to undefined behaviour is this series of blog posts:
What Every C Programmer Should Know About Undefined Behavior

Systems Programming
Undefined Behaviour Sanitizer
• Undefined behaviour sanitizer detects various types of undefined behaviour
• Here, an integer overflow is detected:

int main(int argc, char **argv) {


int k = 0x7fffffff; // this is the largest possible signed int value ...
k += argc; // ... this will produce an integer overflow
return 0;
}

% clang -fsanitize=undefined -Wall -Werror intOverflow.c -o intOverflow


% ./intOverflow
intOverflow.c:3:5: runtime error: signed integer overflow: 2147483647 + 1 cannot
be represented in type 'int'

Systems Programming
Next week
• We are going to look at resource management and ownership
• We will particularly focus on memory management

Systems Programming
Systems Programming
Memory Management & Ownership

Yehia Elkhatib

Systems Programming
Housekeeping
• CW1 date parsing
– It is enough to check that 0<day<32, 0<month<13, year is > 0
• To access lldb on the servers, you need to use this command:
scl enable llvm-toolset-7.0 bash
• Please fill the questionnaire to get feedback for all Level 3 Term 1 courses in
preparation for the Oct 31st Staff Student Liaison Committee (SSLC) meeting
https://forms.office.com/r/TtyerkNHL6 - link also on Moodle

Systems Programming 1
Overview

• C++ Crash Course


• RAII

Systems Programming
Why C?
• Versatile PL: whole OS, compilers, networking, games, web apps, …
• Linux kernel
• Systems with constrained resources
• Builds understanding of resource management

• …but, unlike many other languages, C has no garbage collection

Systems Programming
Memory Management Challenges
• When we allocate memory ourselves with malloc, we are responsible for calling free
• We must call free exactly once for each address we obtained from malloc
• It is good practice to assign the NULL value to pointers that have been freed
But this does not prevent all double free errors:
int main() {
void * ptr1 = malloc(sizeof(int));
void * ptr2 = ptr1;
free(ptr1); ptr1 = NULL;
free(ptr1); // OK. Calling free with NULL is fine
free(ptr2); // *** error for object 0x7f9355c00690: pointer being freed
// was not allocated
}

• Another problem is dangling pointers, which point to locations that have been freed:
int main() {
node * left_child = create_tree("a", NULL, NULL);
node * root = create_tree("b", left_child, NULL );
destroy_tree(left_child); // now: root->left_child points to freed memory!
}

Systems Programming
More Memory Management Challenges
• If we never call free for a heap-allocated pointer we call this a memory leak
#include <stdlib.h>
#include <stdio.h>

int main() {
char *mem = (char*) malloc( sizeof(char) * 20); // allocate some memory

if (!mem) { exit(EXIT_FAILURE); } // check if allocation went fine

// use allocated memory


mem = (char*) malloc( sizeof(char) * 30); // allocate more memory
// we just lost the pointer to the old memory
// use newly allocated memory

free(mem); // free newly allocated memory


// we leaked the memory allocated first
}

Systems Programming
Ownership
• To organise memory management we adopt the concept of ownership
• Ownership means that we identify a single entity that is responsible for
managing a location in memory
• For example we might say that the parent in a tree owns its children
i.e. it is responsible for allocating and freeing the memory used by its children
• C does not assist in this as it does not enforce an ownership model
• Next we will learn a feature in C++ that helps us to follow an ownership model

Systems Programming
C++ Quick Intro
• C++ is (almost) a superset of C
– has almost double the keywords of C
• Main difference: OOP
– focus is on objects rather than functions
o a class is a cookie cutter; objects (instances) are the cookies
– OOP principles: Inheritance, Encapsulation, Abstraction, Polymorphism
• Some library differences
– e.g. <iostream.h> instead of <stdio.h>
• Compile with clang++
– use the std flag to specify the language standard
– e.g. clang++ -std=c++17 demo.cpp

Systems Programming
Ownership in C++: RAII
• In C++ we can express ownership of a heap memory location explicitly in the code
• The creator of C++ referred to this as RAII - Resource Acquisition Is Initialization
1. We tie the management of a resource to the lifetime of a variable on the stack
2. The allocation of the resource (e.g. the malloc call) is done when we create the
variable
3. The deallocation of the resource (e.g. the free call) is done when the variable is
destroyed
int main() {
std::vector<int> v = {1, 2, 3, 4, 5, 6}; // heap memory is allocated
// use v
} // memory on the heap will be freed automatically

• This programming idiom is implemented by struct data types with 2 member functions:
– The constructor, called when creating a variable of this type
– The destructor, called when a variable of this type reaches the end of its lifetime

Systems Programming
Implementing RAII in C++ - Example
• If we want to store a number of integer values on the heap, we create a
special struct data type:
struct ints_on_the_heap { // This is C++ code and not legal C!
int * ptr; // dynamic array of ints
// constructor
ints_on_the_heap(int size) {
ptr = (int*)malloc(sizeof(int) * size);
}
// destructor
~ints_on_the_heap() { free(ptr); }
};
typedef struct ints_on_the_heap ints_on_the_heap;

• Initializing a variable on the stack with this data type allocates memory on the heap
• Once the variable on the stack reaches the end of its lifetime, it will automatically free the heap memory
int main() {
ints_on_the_heap i(23); // i is on stack; 23 int spaces are allocated on heap
i.ptr[22] = 42;
} // automatic call to ~ints_on_the_heap will free heap memory

Systems Programming
Resource management with RAII
• This principle is applicable not just for memory management, but for general
resource management (files, network, database connections, ...)
• For example we can encapsulate the opening and closing of a file using RAII:
struct file_handle {
FILE * file;
file_handle(const char * filename) {
file = fopen(filename, "r");
if (!file) { exit(EXIT_FAILURE); }
}
~file_handle() { fclose(file); }
};
typedef struct file_handle file_handle;
int main() {
file_handle f("filename"); // call to file_handle(filename) opens file
// use f.file
} // call to ~file_handle() closes file

• RAII is used everywhere in C++ for resource management


– We will see further examples later in the course

Systems Programming
Modern Memory Management in C++
• C++ provides common RAII-based data types for easy and non-leaking memory
management
• To store multiple values of the same type on the heap, use std::vector:
int main() {
std::vector<int> v = {1, 2, 3, 4, 5, 6}; // heap memory is allocated
// use v
} // memory on the heap is freed automatically here

https://en.cppreference.com/w/cpp/container/vector
• For storing a single value on the heap we should use one of two “smart” pointers:
– std::unique_ptr for unique ownership
the default case, when a single variable owns the value on the heap
– std::shared_ptr for shared ownership
when it is not possible to identify a single owner (e.g. multi-threaded progs)
• We use plain pointers for non-owning relationships

Systems Programming
Binary Tree with Smart Pointers
(example of unique ownership)
#include <memory>
struct node {
const void * value_ptr;
// The parent owns both children uniquely
std::unique_ptr<struct node> left_child;
std::unique_ptr<struct node> right_child;
// Each child knows their parent, but doesn't own it
struct node* parent;
node(const void * value_ptr_) {
value_ptr = value_ptr_;
left_child = NULL; right_child = NULL; parent = NULL; }
// destructors of left/right_child are auto-called to free their heap memory
~node() { }
void add_left_child(const void* value_ptr) {
// make_unique allocates heap memory for 1 node and calls the node-constructor
left_child = std::make_unique<struct node>(value_ptr); }
void add_right_child(const void* value_ptr) {
right_child = std::make_unique<struct node>(value_ptr); }
};

Systems Programming
Task
• There are two implementations of a binary tree, one in C and one in C++
• T1: try to execute both examples
• T2: read the source code and notice the differences between them

Systems Programming
Next week
• There is no lecture next week!
• You are invited to work on your coursework in the lab
• In two weeks, we will start looking at threads and how to write multi-threaded
applications

Systems Programming
Systems Programming (Part 2)
Concurrent Systems Programming
Phil Trinder

A PThread Tutorial:
https://www.cs.cmu.edu/afs/cs/academic/class/15492-f07/www/pthreads.html

Systems Programming
Chapter 01
1

Recap of lectures so far

In the introduction to systems programming, we have learned that:


• data types give bit representations in memory a meaning
• structs allow us to implement custom data structures (like linked lists or trees)
• every variable is stored at a fixed memory location
• a pointer is a variable storing the address of a memory location
• in C/C++ memory on the stack is automatically managed but memory on the heap must be
managed manually
• to organise resource and memory management we should think about ownership
• in C++ ownership is implemented by tying the resource management to the lifetime of a
stack variable
• smart pointers and containers make it easier to manage memory and to avoid leaks
• debuggers, static and dynamic analysis tools help to detect and fix bugs
• In the second part of the course we are looking into concurrent systems programming

Systems Programming
Chapter 01
2

What is concurrency?

• The ability of different program parts (typically functions) to be executed


simultaneously.
• There are many different methods to enable concurrent programs. We
distinguish two classes:
• Shared memory locations are read and modified to communicate
between concurrent components. Requires synchronisation to ensure
that communication happens safely.
We will look at explicit programming with threads and use shared
memory communication.
• Message passing tends to be far easier to reason about than shared-
memory concurrency, and is typically considered a more robust form of
concurrent programming. Examples of message passing systems are the
actor model implemented in Erlang or CSP-style communication in Go.
Covered in the Distributed and Parallel Technologies H/M course

Systems Programming
Chapter 01
3

Concurrency vs. Parallelism

We make a clean distinction between Concurrency and Parallelism.


Concurrency is a _programming paradigm_, wherein threads are used typically for dealing with
multiple asynchronous events from the environment, or for structuring your program as a
collection of interacting agents.
Parallelism, on the other hand, is just about making your programs go faster. You shouldn't need
threads to do parallelism, because there are no asynchronous stimuli to respond to. It just so
happens that it's possible to run a concurrent program in parallel on a multiprocessor, but that's
just a bonus.
The main point: to make your program go faster, you shouldn’t have to make it concurrent.
Concurrent programs are hard to get right, parallel programs needn't be.
Simon Marlow on the Haskell-cafe mailing list

But when people hear the word concurrency they often think of parallelism, a
related but quite distinct concept. In programming, concurrency is the composition
of independently executing processes, while parallelism is the simultaneous
execution of (possibly related) computations. Concurrency is about dealing with
lots of things at once. Parallelism is about doing lots of things at once.
Andrew Gerrand on the Go blog

Systems Programming
Chapter 01
4

Processes vs. Threads

Processes
• Every C program we have written is executed as a process
• Multiple processes can be executed simultaneously
• Each process has its own memory address space

Threads
• A thread of execution is an independent sequence of program
instructions
• Multiple threads can be executed simultaneously
• A process can have multiple threads sharing the same address
space of the process
• We will use threads to implement concurrent programs

Systems Programming
Chapter 01
5

Thread implementations

• There are thread implementations in almost all programming languages


• We will use the C pthread library and C++ threads
• Many threading implementations are conceptually very similar, and follow a lifecycle:
– A thread is created
– starts executing a specified function
– with some arguments
– Is given an identifier
• Wait for another thread to terminate
• Interrupt/kill another thread
– Terminates either by explicitly calling exit or when its function terminates
• Communication between threads is by modifying the state of shared variables

Systems Programming
Chapter 01
6

POSIX Threads

• POSIX Threads (short pthreads) is the most commonly used threading


implementation for C
• To use it we need to #include <pthread.h>
and specify a compiler flag –lpthread

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

void * PrintHelloWorld(void* arg) {
    printf("Hello World from a thread!\n");
    return NULL;
}

int main() {
    pthread_t thread;
    int error = pthread_create(&thread, NULL, PrintHelloWorld, NULL);
    assert(error == 0);
    printf("Created thread\n");
    error = pthread_join(thread, NULL);
    assert(error == 0);
}

clang -Wall -Werror program.c -lpthread -o program

Systems Programming
Chapter 01
7

Creating a thread

Threads are created with the pthread_create function


int pthread_create(pthread_t *thread,
const pthread_attr_t *attr,
void *(*start_routine)(void*),
void *arg);
It takes four arguments:
1. A thread identifier, i.e. a pointer to a memory location of type pthread_t.
2. Thread attributes which set properties such as scheduling policies or stack size.
Passing NULL results in default attributes.
3. A function pointer to the start_routine.
This function takes a single argument of type void* and returns a value of type void*.
4. The argument that is passed to start_routine.
It returns 0 if the thread is created successfully or a non-zero error code.
Passing pointers to and from start_routine allows the passing of arbitrary data.
It also requires care to ensure that the memory locations pointed to have appropriate lifetimes.
Systems Programming
Chapter 01
8

Waiting for another thread to Terminate

To wait for another thread to terminate we use pthread_join


int pthread_join(pthread_t thread, void **value_ptr);
It takes two arguments:
1. A thread identifier.
2. A pointer to a memory location of type void*.
The return value of the start_routine passed to pthread_create will be copied to this location.
It returns 0 on success and a non-zero error code otherwise.

Systems Programming
Waiting for another thread to Terminate

Example of returning a single int value from a thread:


int* return_value;
int error = pthread_join(thread, (void**)&return_value);
assert(error == 0);
if (return_value) { printf("return_value: %d\n", *return_value); }
// maybe: free(return_value);
Systems Programming
Chapter 01
10

To Do Now

Completing these activities will reinforce your understanding of the material.
In the “Examples from Lecture 6” folder on Moodle
• Review the code for pthread_hello_world.c
• Compile it and run it
• Add another print statement “Last thread about to die” just before the
main thread terminates

Systems Programming
Chapter 01
11

Introducing Mutual Exclusion

• The Mutual Exclusion problem was first identified by Edsger Dijkstra in 1965
• This problem (and its solution) is the beginning of the computer science of concurrency
• What is the problem that mutual exclusion is solving?
• Consider two threads T0 and T1 attempting to increment a count variable with a C compiler
that implements count++ as
register1 = count
register1 = register1 + 1
count = register1
• Consider this execution interleaving with “count = 5” initially:
T0: register1 = count {T0 register1 = 5}
T0: register1 = register1 + 1 {T0 register1 = 6}
T1: register1 = count {T1 register1 = 5}
T1: register1 = register1 + 1 {T1 register1 = 6}
T1: count = register1 {count = 6 }
T0: count = register1 {count = 6}
Q: Should the value of count be 6? What has gone wrong? How can we fix it?
Systems Programming
Chapter 01
12

Mutual Exclusion Example 2

• Consider a singly linked list:

• Two threads simultaneously start removing the elements from the list.
• The first thread removes “b” by moving the next pointer of “a” to “c”:

• The second thread removes “c” by moving the next pointer of “b” to “d”:

• The resulting list is inconsistent, i.e. it still contains node “c” (the item deleted by the second
thread)!
• We can avoid inconsistency by ensuring that simultaneous updates to the same part of the list
cannot occur
Systems Programming
Chapter 01
13

Critical Regions
& Race Conditions

• A critical region is a part of the code that updates some shared variable(s) (like count), or a
shared data structure (like the list)
• Critical regions must be protected so that concurrent threads don’t interfere with each other.
Mutual exclusion means that only one thread ever executes a critical section
• A race condition occurs when the result of program execution depends on the order in which
threads are executed, e.g. a different interleaving of T0 and T1 with “count = 5” would
produce:
T0: register1 = count {T0 register1 = 5}
T0: register1 = register1 + 1 {T0 register1 = 6}
T0: count = register1 {count = 6}
T1: register1 = count {T1 register1 = 6}
T1: register1 = register1 + 1 {T1 register1 = 7}
T1: count = register1 {count = 7 }

Q: Demonstrate a race condition for the list deletion

Systems Programming
Chapter 01
14

Locks Protect
Critical Regions

• Associate a lock (aka a mutex) with each critical section


• Before executing a critical section a thread must acquire the lock
• On leaving the critical section the thread must release the lock
• The semantics of acquire are:
if another thread owns the lock, the requesting thread is blocked
until the owning thread releases the lock
If the lock is not owned, the requesting thread is granted ownership
• On release of a lock, one of the threads (if any) blocked waiting for the
lock will be granted ownership and continue execution

Systems Programming
Chapter 01
15

Simple Lock Example

• If both threads T0 and T1 execute:


acquire(theLock);
count++;
release(theLock);
the threads will execute without interfering

• That is either T0 will complete before T1, or T1 before T0

Systems Programming
Chapter 01
16

Bounded Buffer

• Problem first proposed by Dijkstra in 1972


• A bounded buffer is a buffer with a fixed size
• Producers insert elements into the buffer
• Consumers remove elements from the buffer
• We have to ensure that:
producers wait when the buffer is full.
consumers wait when the buffer is empty.
threads resume work when space or data is available again.

Systems Programming
Chapter 01
17

Unsafe Bounded Buffer

int count = 0;
int in = 0;
int out = 0;
Type buffer[N];

// PRODUCER
for (;;) {
    // produce an item and put in nextProduced
    while (count == N)
        ; // do nothing
    buffer[in] = nextProduced;
    in = (in + 1) % N;
    count++;
}

// CONSUMER
for (;;) {
    while (count == 0)
        ; // do nothing
    nextConsumed = buffer[out];
    out = (out + 1) % N;
    count--;
    // consume the item in nextConsumed
}

Q: Show two interactions of producer and consumer that can lead to


inconsistency
Systems Programming
Chapter 01
18

Locked Bounded Buffer

Lock theLock;
int count = 0;
int in = 0;
int out = 0;
Type buffer[N];

// PRODUCER
for (;;) {
    // produce an item and put in nextProduced
    acquire(theLock);
    while (count == N)
        ; // do nothing
    buffer[in] = nextProduced;
    in = (in + 1) % N;
    count++;
    release(theLock);
}

// CONSUMER
for (;;) {
    acquire(theLock);
    while (count == 0)
        ; // do nothing
    nextConsumed = buffer[out];
    out = (out + 1) % N;
    count--;
    release(theLock);
    // consume the item in nextConsumed
}

In this version the buffer will not become inconsistent. Are we done?
Systems Programming
Chapter 01
19

Locks can cause


Deadlock

• Consider the situation where count == 0 and the Consumer


executes
while (count == 0)
;
• While executing this loop, the Consumer owns theLock, so the
Producer cannot acquire it to place a new item in the buffer and
increment count
• As a result, we have deadlock: neither thread can make progress
• Q: Show a different case where the Locked Bounded Buffer can
deadlock

Systems Programming
Chapter 01
20

Busy Waiting
Deadlock Solution

• We can avoid deadlock if when a thread must wait for some condition
it releases the locks it owns
• So in our example the consumer repeatedly:
Acquire Lock
If (count == 0) Release Lock
• This is called Busy Waiting (or sometimes polling)
• Wastes CPU cycles and energy!

Systems Programming
Chapter 01
21

Tea maker with busy waiting

pthread_mutex_t m;
bool teaIsReady = false;

void *me(void* arg) { (void)arg;
    printf("Me: Waiting for my tea ...\n");
    pthread_mutex_lock(&m);
    while (!teaIsReady) {
        pthread_mutex_unlock(&m);
        printf("Me: (Unamused) // do nothing\n");
        pthread_mutex_lock(&m);
    }
    pthread_mutex_unlock(&m);
    printf("Me: (Happy) ... finished waiting.\n");
    return NULL; }

void *teaRobot(void* arg) { (void)arg;
    printf(" Tea Robot: Making tea ...\n");
    usleep(randInt());
    pthread_mutex_lock(&m);
    teaIsReady = true;
    pthread_mutex_unlock(&m);
    printf(" Tea Robot: Done!\n");
    return NULL; }
int main() {
pthread_t t1; pthread_t t2; int err;
err = pthread_mutex_init(&m, NULL); assert(err == 0);
err = pthread_create(&t1, NULL, me, NULL); assert(err == 0);
err = pthread_create(&t2, NULL, teaRobot, NULL); assert(err == 0);
err = pthread_join(t1, NULL); assert(err == 0);
err = pthread_join(t2, NULL); assert(err == 0);
pthread_mutex_destroy(&m); }
Systems Programming
Chapter 01
22

Condition Variable
Deadlock Solution

• Busy waiting wastes CPU cycles and energy


• It would be better to block the thread seeking to acquire a lock, and wake it when it has a
chance to proceed
• Achieved using a condition variable cv:

pthread_cond_wait(&cv, &m)
must be called with a locked mutex m
It releases the mutex and blocks the calling thread on cv

pthread_cond_signal(&cv)
assigns mutex ownership to one of the threads blocked on cv, and wakes it

Systems Programming
Chapter 01
23

Condition Variable
Deadlock Solution

• The condition may have changed before the wakened thread is scheduled, so it’s important to
check the condition again:
pthread_mutex_lock(&m);
while (!cond) {
pthread_cond_wait(&cv, &m);}

• Condition variables are analogous to hardware signals/interrupts

Systems Programming
Chapter 01
24

Tea maker with


condition variables
pthread_mutex_t m; pthread_cond_t cv;
bool teaIsReady = false;

void *me(void* arg) { (void)arg;
    pthread_mutex_lock(&m);
    while (!teaIsReady) {
        printf("Me: Waiting for my tea ...\n");
        pthread_cond_wait(&cv, &m); }
    printf("Me: (Happy) ... finished waiting.\n");
    printf("Me: Is the tea really finished? %s\n",
           teaIsReady ? "Yes" : "No");
    pthread_mutex_unlock(&m);
    return NULL; }

void *teaRobot(void* arg) { (void)arg;
    printf(" Tea Robot: Making tea ...\n");
    usleep(randInt());
    pthread_mutex_lock(&m);
    teaIsReady = true;
    pthread_mutex_unlock(&m);
    pthread_cond_signal(&cv);
    printf(" Tea Robot: Done!\n");
    return NULL; }
int main() {
    pthread_t t1; pthread_t t2; int err;
    err = pthread_mutex_init(&m, NULL); assert(err == 0);
    err = pthread_cond_init(&cv, NULL); assert(err == 0);
    err = pthread_create(&t1, NULL, me, NULL); assert(err == 0);
    err = pthread_create(&t2, NULL, teaRobot, NULL); assert(err == 0);
    err = pthread_join(t1, NULL); assert(err == 0);
    err = pthread_join(t2, NULL); assert(err == 0);
    pthread_cond_destroy(&cv);
    pthread_mutex_destroy(&m); }

Systems Programming
Chapter 01
25

To Do Now

Attempt the questions on the preceding slides


In the “Examples from Lecture 6” folder on Moodle
• Review the code for the tea maker examples tea1.c, tea2.c and tea3.c
• Compile and run the examples a couple of times
• Compare the results, and reflect on them

Systems Programming
Chapter 01
26

Bounded Buffer
With Condition Variables

• We need two condition variables:


– Producer must wait until there is space in the buffer: add
– Consumer must wait until there are item(s) in the buffer: remove
• This bounded buffer is a monitor – a class that provides thread safe
access to a shared resource (the buffer).
• In a monitor all (public) functions, like addItem, removeItem
provide mutual exclusion
• A Java class where all public methods are synchronised is a
monitor

Systems Programming
Chapter 01
27

Bounded Buffer Monitor


Main
struct BoundedBuffer {
int start; int end; int size; int* buffer;
pthread_mutex_t m;
pthread_cond_t add;
pthread_cond_t remove;
};

typedef struct BoundedBuffer BoundedBuffer;

BoundedBuffer * createBoundedBuffer(int size) { ... }


void destroyBoundedBuffer(BoundedBuffer * bb) { ... }
void addItem(BoundedBuffer * bb, int item) { ... }
int removeItem(BoundedBuffer * bb) { ... }
void * producer(void * arg) { ... }
void * consumer(void * arg) { ... }

int main() {
    pthread_t t1; pthread_t t2; int err;
    BoundedBuffer * bb = createBoundedBuffer(4);
    err = pthread_create(&t1, NULL, consumer, bb); assert(err == 0);
    err = pthread_create(&t2, NULL, producer, bb); assert(err == 0);
    err = pthread_join(t1, NULL); assert(err == 0);
    err = pthread_join(t2, NULL); assert(err == 0);
}
Systems Programming
Chapter 01
28

Bounded Buffer Monitor


Producer

void * producer(void * arg) {


BoundedBuffer * bb = (BoundedBuffer*)arg;
for (int i = 0; i < 10; i++) {
int item = randInt();
printf("produced item %d\n", item);
addItem(bb, item);
usleep(randInt());
}
return NULL;
}

Systems Programming
Chapter 01
29

Bounded Buffer Monitor


Consumer

void * consumer(void * arg) {


BoundedBuffer * bb = (BoundedBuffer*)arg;
for (int i = 0; i < 10; i++) {
int item = removeItem(bb);
printf(" consumed item %d\n", item);
usleep(randInt());
}
return NULL;
}

Systems Programming
Chapter 01
30

Bounded Buffer Monitor


AddItem

void addItem(BoundedBuffer * bb, int item) {


if (!bb) return;
pthread_mutex_lock(&bb->m);
while (bb->start == bb->end) { // buffer is full
printf("== Buffer is full ==\n");
pthread_cond_wait(&bb->add, &bb->m);
}
// buffer is no longer full
bb->buffer[bb->start] = item;
bb->start = (bb->start + 1) % bb->size; // move start one forward
pthread_mutex_unlock(&bb->m);
pthread_cond_signal(&bb->remove);
}

Systems Programming
Chapter 01
31

Bounded Buffer Monitor


RemoveItem

int removeItem(BoundedBuffer * bb) {


if (!bb) assert(0);
pthread_mutex_lock(&bb->m);
while ( ((bb->end + 1) % bb->size) == bb->start ) { // buffer is empty
printf("== Buffer is empty ==\n");
pthread_cond_wait(&bb->remove, &bb->m);
}
// buffer is no longer empty
bb->end = (bb->end + 1) % bb->size; // move end one forward
int item = bb->buffer[bb->end];
pthread_mutex_unlock(&bb->m);
pthread_cond_signal(&bb->add);
return item;
}

Systems Programming
Chapter 01
32

Bounded Buffer Monitor


Reflection

• The buffer is implemented as a ring buffer


• Two condition variables (add and remove) signal the two different conditions (buffer
is full or empty)
Q: Can you implement the buffer with a single condition variable?
Q: Do you recommend implementing the buffer with a single condition variable?
• Before inserting or removing an element the size of the buffer is checked and the
threads wait if there is no space (producer) or no data (consumer)
• After inserting or removing an element the thread signals that data or space is now
available
• The producer and consumer functions do not need to use locking themselves;
they use a monitor

Systems Programming
Chapter 01
33

To Do Now

Attempt the questions on the preceding slides


In the “Examples from Lecture 6” folder on Moodle
• Review the code for the bounded buffer:
bounded_buffer_unsafe.c
bounded_buffer_safe.c
• Compile and run the examples a couple of times
• Compare the results, and reflect on them

Systems Programming
Chapter 01
34

Concurrent Programming is Hard

A concurrent program poses all of the challenges of sequential


Computation: i.e. what to compute. A correct and
efficient Algorithm, using appropriate data structures
must be constructed.
A concurrent program must also specify a correct and
effective strategy for Coordination: i.e. how threads
should cooperate.

Systems Programming
Chapter 01
35

Some Important Thread


Coordination Aspects

Partitioning: determining what parts of the computation should be


separately evaluated, e.g. a thread to serve each request, to render
each frame of a film.
Data Sharing: What data to share between threads, and when to
acquire it e.g. film rendering threads may hold 2 frames: 1 being
processed and another ready to go as soon as the current frame is
complete
Synchronisation: ensuring threads can cooperate without
interference, e.g. if threads representing 2 players compete to get a
single resource then only one succeeds.

Systems Programming
Chapter 01
36

Coordination Abstraction Levels

You have probably used many notations for specifying, designing & constructing
computations but relatively few for coordination.
Computations can be written in languages with different levels of abstraction,
e.g.
Low-level: Assembler, C
Mid-level: Java, Python
High-Level: Haskell, SQL, Prolog

Systems Programming
Chapter 01
37

Coordination Abstraction Levels

Likewise coordination can be written in languages with different levels of
abstraction, e.g.
Low-level: Locks, Semaphores, C++ Threads
Mid-level: Go, Monitors (Java threads)
High-Level: OpenMP, Erlang

Systems Programming
Chapter 01
38

Concurrent Coordination
options for Languages
A programming language may have several concurrency options,
often competing. For example
C++ has:
Thread libraries, e.g. POSIX
std threads

Java has:
Thread libraries, e.g. POSIX
Java threads
Executors

Systems Programming
Chapter 01
39

Higher Level Concurrency

In the remainder of the course we will


• Cover some widely used, and higher-level, concurrency
constructs in C and C++
• Illustrate how higher-level constructs can be implemented using
lower-level constructs.
– the bounded buffer example above shows a monitor implemented
using mutexes & condition variables

Systems Programming
40

Semaphores

• Another classical synchronization primitive, also invented by Edsger


Dijkstra
• A (counting) semaphore holds an integer counter and provides 2
operations:
wait decrements the counter and blocks if it is less than zero
signal increments the counter and wakes waiting threads
• We can count how many items of a resource are used and block access
if no resources are available

Systems Programming
Chapter 01
41

Semaphores

• A binary semaphore (with a count of 1)


– combines a mutex and a condition variable
– Allows a single thread into a critical section: blocking/reawakening
threads
• We can use a pair of semaphores to ensure a correct usage of an
unsafe bounded buffer implementation

Systems Programming
Chapter 01
42

Bounded Buffer
Semaphore Implementation
struct BoundedBuffer { int start; int end; int size; int* buffer; };
typedef struct BoundedBuffer BoundedBuffer;

sem_t fillCount; // data in the buffer


sem_t emptyCount; // free space in the buffer

BoundedBuffer * createBoundedBuffer(int size) { ... }


void destroyBoundedBuffer(BoundedBuffer * bb) { ... }
void addItem(BoundedBuffer * bb, int item) { ... }
int removeItem(BoundedBuffer * bb) { ... }
void * producer(void * arg) { ... }
void * consumer(void * arg) { ... }

int main() {
pthread_t t1; pthread_t t2; int err;
BoundedBuffer * bb = createBoundedBuffer(4);
fillCount = sem_create(0, 4); // no data in the buffer yet
emptyCount = sem_create(4, 4); // all spaces in the buffer are free
err = pthread_create(&t1, NULL, consumer, bb); assert(err == 0);
err = pthread_create(&t2, NULL, producer, bb); assert(err == 0);
err = pthread_join(t1, NULL); assert(err == 0);
err = pthread_join(t2, NULL); assert(err == 0); }
Systems Programming
Chapter 01
43

Bounded Buffer
Semaphore: Producer

void * producer(void * arg) {


BoundedBuffer * bb = (BoundedBuffer*)arg;
for (int i = 0; i < 10; i++) {
sem_wait(emptyCount);
int item = randInt();
printf("produced item %d\n", item);
addItem(bb, item);
sem_signal(fillCount);
usleep(randInt());
}
return NULL;
}

Systems Programming
Chapter 01
44

Semaphores are simpler


than locks & cond. vars

Side by side: each mutex/condition-variable block on the left is replaced by a
single semaphore operation on the right:

int removeItem(BoundedBuffer * bb) {
    if (!bb) assert(0);
    pthread_mutex_lock(&bb->m);                  sem_wait(emptyCount);
    while ( ((bb->end + 1) % bb->size) == bb->start ) { // buffer is empty
        printf("== Buffer is empty ==\n");
        pthread_cond_wait(&bb->remove, &bb->m);
    }
    // buffer is no longer empty
    bb->end = (bb->end + 1) % bb->size; // move end one forward
    int item = bb->buffer[bb->end];
    pthread_mutex_unlock(&bb->m);                sem_signal(fillCount);
    pthread_cond_signal(&bb->add);
    return item;
}

Systems Programming
Chapter 01
45

Bounded Buffer
Semaphore: Consumer

void * consumer(void * arg) {


BoundedBuffer * bb = (BoundedBuffer*)arg;
for (int i = 0; i < 10; i++) {
sem_wait(fillCount);
int item = removeItem(bb);
printf(" consumed item %d\n", item);
sem_signal(emptyCount);
usleep(randInt());
}
return NULL;
}

Systems Programming
46

Semaphore Implementation

• Uses a combination of a mutex and a condition variable:


typedef struct sem {
    pthread_mutex_t mut;  // mutex to protect value
    pthread_cond_t cv;    // condition variable to signal change to value
    unsigned int value;   // the current value of the counter
    unsigned int maxval;  // the maximum value of the counter
} Semaphore;
• wait decrements the counter and blocks if it would go below zero:
void sem_wait(Semaphore* s) {
    pthread_mutex_lock(&(s->mut));
    while (s->value <= 0) { pthread_cond_wait(&(s->cv), &(s->mut)); }
    s->value--;
    pthread_mutex_unlock(&(s->mut)); }
• signal increments the counter and signals waiting threads:
void sem_signal(Semaphore* s) {
    pthread_mutex_lock(&(s->mut));
    s->value++;
    pthread_cond_signal(&(s->cv));
    pthread_mutex_unlock(&(s->mut)); }
Systems Programming
Concurrency Reflection

• Safely and correctly managing concurrent threads that share


state is tricky
• You must avoid problems of
Deadlock
Livelock: threads run, but make no progress
Starvation: some thread never makes progress
• If you use locks you must correctly lock, unlock, and wait to
arrange safe access

Systems Programming
Chapter 01
48

To Do Now

In the “Examples from Lecture 6” folder on Moodle


• Review the code for the semaphore bounded buffer:
bounded_buffer_sem.c
• Compile and run the program a couple of times & reflect on the results

Systems Programming
Systems Programming
C++ Threading
Phil Trinder

Additional Material:
Concepts: Chapter 42 of Stroustrup, B., 2013. The C++ programming
language. Pearson Education.
Programming constructs:
en.cppreference.com

Systems Programming
Review of Lecture 6

Introducing Concurrency
– We learned that concurrency is a programming paradigm to deal with lots of
things at once
– This is distinct from parallelism which is about making a program run faster
– Threads execute concurrently inside of the same process and share the same
address space
– Communication between threads uses shared memory and requires careful
synchronisation
Low Level Concurrency
– We learned how to create and wait for a thread using pthreads in C
– Race conditions can be avoided using mutual exclusion which we ensure with a
mutex/lock
– Condition variables avoid busy waiting for a condition to become true
Systems Programming 1
Review of Lecture 6

Higher Level Concurrency


– We distinguished coordination & computation, and learned that concurrency
can be specified at different levels of abstraction
– We can design data structures like a bounded buffer that internally use
synchronisation mechanisms to ensure thread-safe usage: Monitors
– Counting/binary Semaphores are higher level, combining mutex, condition
variable & a counter

Systems Programming 2
This lecture: higher level
concurrency in C++

• C++ provides
– low-level pthread constructs
– a mid-level thread implementation as part of the standard library (since C++
2011)
– higher-level threading constructs
• We are going to learn some C++ features that are particularly useful for
programming with threads

Systems Programming
Hello World from a C++ thread
#include <stdio.h>
#include <thread>
int main() {
auto t = std::thread([](){
printf("Hello World from a cpp thread!\n");
});
t.join();
}
• Q: what happens if we leave out the t.join()?
• We need to #include <thread> and compile with -std=c++11 (or greater)
to specify the C++ standard:
% clang++ -std=c++17 -stdlib=libc++ -pthread
-o hello_world hello_world.cpp
• There are various major "modern" C++ standards: c++11, c++14, c++17, c++20
• Pick the latest supported by your compiler!
• In the example we use two features of C++ that we haven't seen so far: auto and
lambdas
Systems Programming
auto: Local Type Inference

In C++ it is possible (and encouraged) to use the auto keyword instead of the name of
the type when declaring variables:
auto i = 42; // means the same as: int i = 42;
The compiler will infer (i.e. figure out) the type of the variable from the initialisation
expression
Crucially the variable still has a data type and the compiler knows it, we just do not
need to write it down
This feature becomes in particular handy for long type names, e.g.:
auto v = std::vector<int>{1, 2, 3, 4, 5, 6, 7, 8, 9};
// using explicit type name:
for (std::vector<int>::iterator iter = v.begin(); iter != v.end(); iter++) { /*...*/ }
// using auto instead:
for (auto iter = v.begin(); iter != v.end(); iter++) { /*...*/ }

Systems Programming
Lambda expressions

We have seen a function pointer used to pass a function as an argument,


e.g. to create threads with pthread:
void * PrintHelloWorld(void* arg) { printf("Hello World\n"); return NULL; }
int main() {
// ...
int error = pthread_create(&thread, NULL, PrintHelloWorld, NULL);
// ... ^ this is passed as a function pointer
}
By writing a lambda expression we can write a function without naming it:

int main() { auto t = std::thread( [](){ printf("Hello World\n"); } ); t.join(); }

Other names for lambdas are: function literal or anonymous functions


Lambda expressions refer to the lambda calculus invented by Alonzo Church in the
1930s
They are available in almost all modern programming languages
Systems Programming
Lambda expressions C++ syntax

Lambda expressions in C++ are written as:


[ /*captures*/ ] ( /*params*/ ) { /*body*/ }
The captures lists variables from the surrounding scope that are passed to
the lambda when it is created
The params list the parameters that are passed to the lambda when it is
called
The body is the function body that executes when the lambda is called
As with function parameters, variables are captured by value, i.e. they are
copied

Systems Programming
Sharing a Value with a
Lambda expression

We can share access to a variable by capturing a pointer to it using this notation:


int main() {
auto l = list{}; l.append_to_list('a'); l.append_to_list('b');
auto t1 = std::thread([l_ptr = &l] (){ l_ptr->remove_from_list(1); });
t1.join();
}

Systems Programming
Threads and Lambdas

Lambda expressions are very useful for writing applications with threads
We start a thread in C++ by constructing an std::thread object with a
function pointer or lambda
In pthreads function pointers are passed as void* to threads, losing the
type information!
void* remove_second_element_from_list(void* arg) {
struct list* list = (struct list*)arg; // restore the meaning of the pointer
// ...
}
int main() {
struct list* list = create_list(create_node('a')); // ...
pthread_create(&t1, NULL, remove_second_element_from_list, list ); // ...
}

Systems Programming
Lambda Expressions Preserve
Function Types

Capturing values for a lambda expression preserves their types

int main() {
auto l = list{}; l.append_to_list('a'); // ...
auto t1 = std::thread([l_ptr = &l] (){ l_ptr->remove_from_list(1); }); // ...
}

Systems Programming
Mutual exclusion:
remembering to unlock

Recall that
• mutual exclusion means that no two threads simultaneously enter a
critical section
• Commonly managed with a mutex and call lock() before we enter
and unlock() when we leave a critical section

But it’s very easy to forget to call unlock when code has complicated
control flow or exceptions

Q: What happens if some control flow fails to unlock a mutex?


Q: What happens if some control flow fails to lock a mutex?

Systems Programming
Mutual exclusion in C++

C++ avoids the issue of forgetting to unlock a mutex by viewing the locked
mutex as an owned resource and applying the RAII (Resource Acquisition Is
Initialization) technique:
We create a local variable on the stack that locks the mutex
At the end of the lifetime the variable releases the lock automatically
#include <mutex>
std::mutex m; // mutex variable; shared across threads
void foo() {
std::unique_lock<std::mutex> lock(m); // acquire the mutex
// ... do some work
} // releases the mutex by calling m.unlock();

Systems Programming
Thread Safe Linked List in C++

We encapsulate an std::list<int> and add a thread-safe interface protected by a


(private) std::mutex
#include <list> #include <thread> #include <optional> #include <mutex>
struct list {
private:
std::list<int> list; std::mutex mutex; // mutex to protect critical section
public:
void append_to_list(int value) {
std::unique_lock<std::mutex> lock(mutex); // lock mutex: enter critical section
list.push_back(value);
} // mutex will be automatically unlocked here
std::optional<int> remove_from_list(int position) {
std::unique_lock<std::mutex> lock(mutex); // lock mutex: enter critical section
auto iter = list.begin();
while (position > 0 && iter != list.end()) { iter++; position--; }
if (position != 0 || iter == list.end()) { return {}; /* nothing to return */ }
int value = *iter;
list.erase(iter);
return value;
} // mutex will be automatically unlocked here
}; Systems Programming
Thread Safe Linked List in C++

std::list<int>::iterator begin() { return list.begin(); }


std::list<int>::iterator end() { return list.end(); }
}

int main() {
auto l = list{}; l.append_to_list('a'); l.append_to_list('b'); l.append_to_list('c');
auto t1 = std::thread([l_ptr = &l](){ l_ptr->remove_from_list(1); });
auto t2 = std::thread([l_ptr = &l](){ l_ptr->remove_from_list(1); });
t1.join(); t2.join();
}

Q: Why don’t we need to lock the mutex for the begin() and end() methods?

Q&A Demo: Unsafe linked list

Systems Programming
To Do Now

Starting with the “Examples from Lecture 7” folder on Moodle


1. Review the code for cppthreads_hello_world.cpp
Compile it and run it
2. Review the code for cppthreads_hello_world_bad.cpp
Compile it and run it. Write a sentence explaining what went wrong,
and how to fix it.
3. Review the code for linked_list_safe.cpp
Compile it and run it
4. Can you make a linked list implementation that is unsafe? That is, an
implementation that allows the threads to interfere and demonstrates
a race condition?
Hint: there are different ways to do this, and I’ll demo one in the Q&A
Hint: you’ll probably need a larger list and more threads
Systems Programming
Chapter 01
16

Bounded Buffer

• Problem first proposed by Dijkstra in 1972


• A bounded buffer is a buffer with a fixed size
• Producer threads insert elements into the buffer
• Consumer threads remove elements from the buffer
• We have to ensure that:
producers wait when the buffer is full.
consumers wait when the buffer is empty.
threads resume work when space or data is available again.

Systems Programming
Condition variables in C++

Condition variables are a synchronisation mechanism to wait for conditions without busy
waiting

In C++ we directly provide the test for the condition we are waiting for to the wait call:
cv.wait(lock, [](){ return cond; }); // lock is a std::unique_lock holding the mutex
This idiom checks the condition after the thread is woken up, guaranteeing that cond is
true after the statement is executed.

Aside: For pthreads we learned to always use a while loop when using a condition
variable:
pthread_mutex_lock(&m);
while (!cond) {
pthread_cond_wait(&cv, &m); }

Systems Programming
PThreads Condition variables

For pthreads we always use a while loop when using a condition variable:
pthread_mutex_lock(&m);
while (!cond) {
pthread_cond_wait(&cv, &m); }

Be aware that pthreads and C++ threads use inverse logic!


• In C pthread the while checks if the condition is not true and may repeatedly
wait
• In C++ we write the check the other way around to express that we wait for the
condition to become true

Systems Programming
Bounded Buffer in C++

The C++ implementation is simpler than C with pthreads:


• We only need to provide a thread safe wrapper for an std::vector
• The std::vector automatically takes care of the memory management
struct BoundedBuffer {
private:
int start; int end; int size;
std::vector<int> buffer; // use std::vector<int> for automatic memory management
std::mutex m;
std::condition_variable add_cv; // there is space to add an item to the buffer
std::condition_variable remove_cv; // there is an item in the buffer that can be removed
public:
BoundedBuffer(int max_size) { ... } // constructor to create bounded buffer
void addItem(int item) { ... } // thread-safe interface to add items to the buffer
int removeItem() { ... } // thread-safe interface to remove items from the buffer
};
int main() {
auto bb = BoundedBuffer{4}; // create a bounded buffer object
auto consumer = std::thread([bb = &bb]{ ... }); // start consumer thread
auto producer = std::thread([bb = &bb]{ ... }); // start producer thread
consumer.join(); producer.join(); // wait for both threads to finish
Systems Programming
}
Bounded Buffer in C++
Producer

int main() {

// start producer thread and capture pointer to the shared bounded buffer
auto producer = std::thread([bb = &bb]{
for (int i = 0; i < 10; i++) {
int item = randInt();
printf("produced item %d\n", item);
bb->addItem(item);
std::this_thread::sleep_for(std::chrono::milliseconds(randInt()));
}
});
consumer.join(); producer.join(); // wait for both threads to finish
}

The consumer implementation is similar

Systems Programming
Bounded Buffer in C++
Constructor

struct BoundedBuffer {
private:
int start; int end; int size;
std::vector<int> buffer; // use vector for automatic memory management
std::mutex m;
std::condition_variable add_cv;
std::condition_variable remove_cv;
public:
BoundedBuffer(int max_size) { // constructor to create bounded buffer
start = 0;
end = max_size-1;
size = max_size;
// no manual malloc (and free) required when using std::vector
buffer = std::vector<int>(size);
}
void addItem(int item) { ... } // thread-safe interface to add items to the buffer
int removeItem() { ... } // thread-safe interface to remove items from the buffer
};
Systems Programming
Bounded Buffer in C++
addItem

struct BoundedBuffer {
private:
...
public:
BoundedBuffer(int max_size) { ... }

void addItem(int item) { // thread-safe interface to add items to the buffer


std::unique_lock<std::mutex> lock(m); // acquire lock: enter critical section
add_cv.wait(lock, // wait with the condition variable 'add_cv'
[this]{ return start != end; }); // for the cond 'start != end' to become true
buffer[start] = item;
start = (start + 1) % size;
remove_cv.notify_one(); // notify possibly waiting thread that an item is in the buffer
} // lifetime of 'lock' reached; release lock: exit critical section

};
Systems Programming
Bounded Buffer in C++
removeItem

struct BoundedBuffer {
private:
...
public:
BoundedBuffer(int max_size) { ... }

int removeItem() { // thread-safe interface to remove items from the buffer


std::unique_lock<std::mutex> lock(m); // acquire lock: enter critical section
remove_cv.wait(lock, // wait with the condition variable 'remove_cv'
[this]{ return ((end + 1) % size) != start; }); // for this condition to become true
end = (end + 1) % size;
int item = buffer[end];
add_cv.notify_one(); // notify waiting threads that there is free space in the buffer
return item;
} // lifetime of 'lock' reached; release lock: exit critical section …
};
Systems Programming
To Do Now

Starting with the “Examples from Lecture 7” folder on Moodle


1. Review the code for the bounded buffer with condition variables:
bounded_bufer_safe.cpp
Compile it and run it a few times

Systems Programming
Asynchronous Tasks

• So far we have created threads explicitly and used low-level


coordination
• Higher-level abstractions are easier to use and harder to get wrong
• Instead of threads we can think about tasks that should be handled
concurrently
• Instead of synchronisation we can think about how these tasks
communicate
• C++ provides a high-level interface to launch an asynchronous task:
std::async …

Systems Programming
Launching tasks with std::async

int fib(int n) { ... } // computes the nth fibonacci number

int main() {
// launch a task to compute the Fibonacci number of 6
auto f6 = std::async([] { return fib(6); });
// while we are waiting for the task to finish we compute the fibonacci number of 7
auto f7 = fib(7);
printf("fib(7) = %d\n", f7);
// now we access the result of the task and wait if it is not yet finished computing
printf("fib(6) = %d (computed asynchronously)\n", f6.get() );
}

Systems Programming
Futures std::future

A task launched with std::async returns an std::future:


int main() {
std::future<int> f6 = std::async([] { return fib(6); });
// ...
printf("fib(6) = %d (computed asynchronously)\n", f6.get() );
}
A future is a handle representing a value that is not yet computed

Systems Programming
Futures std::future

We can pass a future around or store it somewhere, for example

int main() {
auto fs = std::vector<std::future<int>> ();
// launch 10 asynchronous tasks to compute fibonacci numbers
for (int i = 0; i < 10; i++) { fs.push_back(std::async([i] { return fib(i+1); })); }
// ... do some other work; then print the computed numbers
for (int i = 0; i < 10; i++) { printf("fib(%d) = %d\n", i+1, fs[i].get()); }
}

Call future.get() to retrieve the value. Blocks calling thread until the value
has been computed

Systems Programming
Futures Example: parallel sum
Reduce the time to compute the sum of all values in an array by using multiple threads
Use divide & conquer by recursively splitting the array in half, and creating a thread to
compute each half, e.g.
[1..10000]
/ \
[1..5000] [5001..10000]
/ \ / \
[1..2500] [2501..5000] [5001..7500] [7501..10000]
/ \ / \ / \ / \

Conceptually we could keep dividing until we are summing arrays that contain 0 or 1
elements, but these threads would hardly take any time at all …
So it’s sensible to set a threshold, e.g. not divide intervals containing fewer than 1000
elements, instead compute these sequentially in a thread

Systems Programming
Futures Example: parallel sum

int parallelSum(std::vector<int>::iterator begin, std::vector<int>::iterator end) {


auto len = end - begin;
// compute sequentially for small arrays
if (len < 1000) { return std::accumulate(begin, end, 0); }
auto mid = begin + len/2;
// launch asynchronous task for the left half of the array
auto left_side = std::async([=] { return parallelSum(begin, mid); });
// compute right half of array recursively
int right_side = parallelSum(mid, end);
// block to wait for left side to finish
return left_side.get() + right_side;
}

int main() {
std::vector<int> vec = createLargeVector();
auto sum = parallelSum(vec.begin(), vec.end());
printf("sum: %d\n", sum);
}
Systems Programming
Parallel sum Performance

We run the program on an array with 10^6 elements


Sequential sum result: 5243777 (time: 5.491102 ms)
Parallel par_sum result: 5243777 (time: 135.894128 ms)
The parallel version is about 25 times slower than the sequential version!
Why?

Systems Programming
parallel sum thread analysis

[1..10000]
/ \
[1..5000] [5001..10000]
/ \ / \
[1..2500] [2501..5000] [5001..7500] [7501..10000]
/ \ / \ / \ / \

Q: How many threads are created at depth 3?


Q: How many threads are created at depth d?
Q: How many threads are created to sum the values in a 10^6 element array, with threshold
1000?

Systems Programming
A better parallel sum

The naïve parallel sum creates too many threads, and the time managing all of these
threads outweighs the time saved by running the threads
If we have a small number of cores (say 8) we can fix this, by only spawning threads for the
first few recursive calls:
int parallelSum(std::vector<int>::iterator begin, std::vector<int>::iterator end, int depth = 0 )
{
auto len = end - begin;
if (len < 1000 || depth > 3 ) { return std::accumulate(begin, end, 0); }
auto mid = begin + len/2;
auto left_side = std::async([=] { return parallelSum(begin, mid, depth + 1 ); });
int right_side = parallelSum(mid, end, depth + 1 );
return left_side.get() + right_side;
}
Resulting in a much improved parallel runtime:
sum result: 5243777 (time: 4.979168 ms) vs. par_sum result: 5243777 (time: 2.492056 ms)

Systems Programming
To Do Now

Starting with the “Examples from Lecture 7” folder on Moodle


1. Review the code for the naive parallel sum using asynchronous tasks :
simple_sum.cpp
Compile it and run it
2. Review the code for the parallel sum with a depth threshold:
sum.cpp
Compile it, run it
What do you conclude from the runtimes?
3. Review the code for the bounded buffer using asynchronous tasks:
bounded_bufer_safe_async.cpp
Compile it and run it a few times

Systems Programming
Communicating with an async task:
std::promise

A task created with async communicates its result via a future

Sometimes we want to name the channel that the task should write to: a
promise

We can see a future as the reading end of a communication channel:


some_future.get() extracts the value

The writing end of the channel is called a promise:


some_promise.set_value(42);

We can obtain a std::future object from a std::promise:


std::future<int> some_future = some_promise.get_future();
Systems Programming
Using a promise

// This sum function writes to the sum_promise


void sum(std::vector<int>::iterator begin,
std::vector<int>::iterator end,
std::promise<int> sum_promise) {
int sum = std::accumulate(begin, end, 0);
sum_promise.set_value(sum); // 4. write result
}

int main() {
auto numbers = std::vector<int>{ 1, 2, 3, 4, 5, 6 };
std::promise<int> sum_promise; // 1. create promise for an int
std::future<int> sum_future = sum_promise.get_future(); // 2. get future from promise
// 3. create thread that takes ownership of the promise (ownership transfer with std::move)
auto t = std::thread(sum, numbers.begin(), numbers.end(), std::move(sum_promise) );
printf("result = %d\n", sum_future.get() ); // 4. wait and then read result
t.join();
}

Systems Programming
Synchronisation with
std::promise<void>
Usually we communicate data over a channel, e.g. the results of a computation
Sometimes we only want to synchronise tasks e.g. wait until all worker threads are ready
before sending work
We can achieve this with
- an std::promise<void>, a promise to produce nothing (but say when you’re done)
- the std::future<T>::wait() method

void do_work(std::promise<void> barrier) {


std::this_thread::sleep_for(std::chrono::seconds(1)); // do something (like sleeping)
barrier.set_value(); // 4. send signal to other thread
}
int main() {
std::promise<void> barrier; // 1. create promise
std::future<void> barrier_future = barrier.get_future(); // 2. get future from it
auto t = std::thread(do_work, std::move(barrier) ); // 3. launch thread
barrier_future.wait(); // 4. wait for signal
… // 5. continue execution: we know that do_work has completed
t.join(); }
Systems Programming
Tasks as First Class Objects

Most programming languages aim to treat values of all types as first class,
e.g. you can use a value of any type anywhere, e.g. pass it as a parameter,
store it in a data structure, ...

Q: How true is this of languages you know? Can you pass as a parameter,
store in a data structure, return from functions: integers, strings,
structs/objects, functions?

We’ve seen that we can treat communication channels (aka


promises/futures) as first class in C++, but what about tasks?

Manipulating tasks is essential, e.g. to write a thread scheduler, perhaps


for a virtual machine.
Systems Programming
Manipulating tasks with
std::packaged_task
We can
• wrap up a task for future execution using std::packaged_task
• extract the value with task.get_future()

int main() {
auto task = std::packaged_task<int(int,int)>([](int a, int b) { return pow(a, b); }) ;
std::future<int> result = task.get_future() ;

// The task can now be stored, or passed as a parameter.


// When we are ready to use it either
// launch task in the same thread via:
// task(2, 10);
// or start a new thread:
auto t = std::thread(std::move(task) , 2, 10);
t.join();
printf("task result: %d\n", result.get() );
}
Systems Programming
To Do Now

Starting with the “Examples from Lecture 7” folder on Moodle

1. Review and run promise.cpp


What happens if you call sum_future.get() a second time?
Adapt the program to print the sum twice.

2. Review and run promise_void.cpp


Adapt the main function to wait for the barrier signal, i.e. call
barrier_future.wait() a second time

3. Review and run packaged_task.cpp


Adapt the program to evaluate task(3,8) without creating a thread
What happens if the main function evaluates the task a second time, say with
task(4,7)? Explain why.

Systems Programming
Systems Programming
Revision Lecture
Yehia Elkhatib, Phil Trinder

Email Yehia.Elkhatib@Glasgow.ac.uk
Phil.Trinder@Glasgow.ac.uk

Systems Programming
Chapter 01
1

Part 1: Systems Programming

In this course we have studied systems programming. We have learned, that:


• data types give bit representations in memory a meaning
• structs allow us to implement custom data structures (like linked lists or trees)
• every variable is stored at a fixed location in memory
• a pointer is a variable storing the address of a memory location
• memory on the stack is automatically managed but memory on the heap must be
managed manually
• to organise resource and memory management we should think about ownership
• in C++ ownership is implemented by tying the resource management to the lifetime
of a stack variable
• there exist a set of smart pointers and containers for easy and non-leaking memory
management

Systems Programming
Chapter 01
2

Part 2: Concurrency

Concurrency Concepts
You should be able to
• explain what concurrency is (a programming paradigm to deal with lots of
things at once)
• Explain how threads
• execute concurrently inside a process and share the same address space
• communicate using shared memory
• can encounter issues like race conditions, busy waiting etc.
• require careful synchronisation

Systems Programming
Chapter 01
3

Part 2: Concurrency

Low Level Concurrent programming


• You should be able to read and write thread-safe code using pthreads in C
• Specifically to
o provide mutual exclusion with a mutex/lock
o use condition variables to avoid busy waiting
Higher Level Concurrency
• Be familiar with the classic bounded buffer example
• To be able to read & write code that uses thread-safe abstractions:
o Monitors
o Counting & binary Semaphores

Systems Programming
Chapter 01
4

Part 2: Concurrency

Higher level concurrency in C++


You should be able to
l read & write code to create threads and synchronise them using mutexes
l read & write code to launch async tasks and synchronise/communicate using
futures/promises
l Explain the differences between the C++ coordination abstractions, and their
advantages and disadvantages

Systems Programming
Online Exam

• This year the exam will be online


• It will be open book
• You will have 2 hours in total to complete the exam
• Instructions will be given on Moodle on how to submit
answers

Systems Programming
Questions asking for code!

• You do not need to write syntactically correct code! We do not


expect you to compile and test your code for the exam! Write the
exam in the same way as if you were in the university exam halls
with no access to a computer!
• Trying to write, compile and execute your answers would simply
take too long and you will not be able to do the full exams in the
time you have

Systems Programming
What next?

Questions?

What to revise next?


– Revise lecture and lab materials
– Use the past exams as a guide

Office hours to answer questions:


– Part 1 (Yehia): Monday 24th 13:00—14:00 on MS Teams
– Part 2 (Phil): Monday 24th 14:00—15:00 in S101c Lilybank Gardens

Systems Programming
May 2019
(Duration: 1 hour 30 minutes)

DEGREES OF MSci, MEng, BEng, BSc, MA and MA (Social Sciences)

System Programming H
Mock Exam
With Answers
(Answer all 3 questions.)

This examination paper is worth a total of 60 marks

The use of a calculator is not permitted in this examination

INSTRUCTIONS TO INVIGILATORS
Please collect all exam question papers and exam
answer scripts and retain for school to collect.
Candidates must not remove exam question papers.
1. C Programming and Data Types

(a) Describe how strings are represented in C. How many bytes of memory does the
string literal “Hello World” require in a C program? [2]

Strings are represented as arrays of individual characters, each represented by a


value of type char. The end of the string is indicated by the special character \0.
The string “Hello World” therefore is represented by 12 characters (including the
\0 character) and requires 12 bytes of memory.

(b) Implement a binary search tree of strings in the C programming language.

Sketch the implementation based on the following interface. Minor syntax errors
will not be penalised. Make sure that you handle pointers correctly and that you
do not leak memory. Provide comments explaining your implementation.

// a node in the tree should store a string label


typedef struct node Node;

// create a Node with the given label. Returns NULL if unsuccessful


Node* node_create(const char* label);

// destroys the node and connected nodes in the tree


void node_destroy(Node* n);

// inserts a node in the right place to maintain the order of the tree
void insert(Node* n, const char* label);

// lookup a node in the tree. Returns NULL if the label is not present
Node* lookup(Node* n, const char* label);

You should use these string handling functions:

int strcmp(const char* s1, const char* s2);


returns 0 if s1 and s2 are equal, a value <0 if s1 is smaller than s2, >0 if s1 is
bigger than s2

int strlen(const char* s);


returns the number of printable characters in the string

char* strcpy(char* destination, const char* source);


copies the string at source to the destination

Systems Programming Mock Exam 1


/END
- Define the struct. A Node should store a string label. [2]

struct node {
char* label;
Node* left_child;
Node* right_child;
};

- Implement node_create and node_destroy.


Be careful to handle all allocations and pointers properly. [6]

Node* node_create(const char* label) {


if (!label) return NULL;
Node* n = malloc(sizeof(Node));
if (!n) return NULL;

n->label = malloc(strlen(label)+1);
if (!n->label) { free(n); return NULL; }
strcpy(n->label, label);
n->left_child = NULL;
n->right_child = NULL;

return n;
}

void node_destroy(Node* n) {
if (!n) return;
node_destroy(n->left_child);
node_destroy(n->right_child);
free(n->label);
free(n);
}

Systems Programming Mock Exam 2


/END
- Implement insert_node. Ensure that the node is inserted at the right place to
maintain the search order of the tree. If a node with the given label exists
already in the tree no node should be added. [4]

void insert(Node* n, const char* label) {


if (!n) return;
int cmp = strcmp(n->label, label);
if (cmp == 0) return;
if (cmp < 0) {
if (n->left_child == NULL) {
n->left_child = node_create(label);
} else {
insert(n->left_child, label);
}
}
if (cmp > 0) {
if (n->right_child == NULL) {
n->right_child = node_create(label);
} else {
insert(n->right_child, label);
}
}
}

- Implement lookup. [3]

Node* lookup(Node* n, const char* label) {


if (!n) return NULL;
int cmp = strcmp(n->label, label);
if (cmp < 0) { return lookup(n->left_child, label); }
if (cmp > 0) { return lookup(n->right_child, label); }
// cmp must be 0
return n;
}

Systems Programming Mock Exam 3


/END
- Implement a main function that uses the provided interface to create a binary
search tree with three nodes with the labels “Systems”, “Programming”, “2019”.
Write a function that prints all values in the tree and show how it is used. The
tree should be destroyed at the end of the program and there should be no
memory leaks. [3]

void print(Node* n) {
if (!n) return;
print(n->left_child);
print(n->right_child);
printf("%s\n", n->label);
}

int main() {
Node* n = node_create("Systems");
insert(n, "Programming");
insert(n, "2019");

print(n);

node_destroy(n);
}

TOTAL MARKS [20]

Systems Programming Mock Exam 4


/END
2. Memory and Resource Management & Ownership

(a) Explain the purpose of a void-pointer in C. Give an example use case. [2]

A void pointer allows referring to values of an arbitrary type. The programmer is


forced to cast the void pointer into a pointer of a concrete data type before
dereferencing it. Void pointers are essential for implementing generic data types
that should store values of any arbitrary type.

(b) Explain the difference between the two memory regions stack and heap.
Explain briefly how the programmer manages memory on the heap in C. [2]

The stack is automatically managed by the compiler; for this, the size of every
variable must be known without executing the program. The heap is managed
explicitly by the programmer by using the malloc function to request memory of
a certain size and deallocating it by calling free.

(c) Values are passed to function via call-by-value. Explain briefly what this means.
Explain how arrays are treated when passed to a function and how this relates to
pointers. [3]

Call-by-value means that arguments are copied into the parameter of a function.
Arrays are not copied when passed into a function; instead the address (i.e. a
pointer) of the first element is copied. Therefore, the contents of an array can be
modified inside a function.

(d) Explain what a segmentation fault is and describe a strategy for finding out
which part of a C program triggered this error. Give a common cause for a
segmentation fault. [3]

A segmentation fault is raised by the hardware notifying the operating system


that your program has attempted to access a restricted area of memory.
A strategy to find the responsible part of the C program is to use a debugger and
use the backtrace feature which shows the call stack at the time of the
segmentation fault.
The most common cause for a segmentation fault is the dereferencing of a NULL
pointer.

Systems Programming Mock Exam 5


/END
(e) Explain the concept of Ownership for memory management. Describe the
benefits over the direct use of malloc and free. Explain how in C++ the RAII
technique works and how this relates to the lifetime of variables and what role
the special constructor and destructor functions play in this context. [4]

Ownership means that we identify a single entity which is responsible for


managing a location in memory.
The benefits over malloc and free are that common problems of manual memory
management are avoided by construction, for example: double free errors
(calling free multiple times) or memory leaks (not calling free at all).

RAII in C++ ties the management of memory to the lifetime of a variable on the
stack. For this we create a special struct that defines a constructor which
allocates the resource (i.e. calls malloc) and a destructor that deallocates the
resource (i.e. calls free). A variable of this struct type on the stack will allocate
the memory when it is constructed and automatically deallocate it when the
variable goes out of scope.

(f) C++ provides two main smart pointers: std::unique_ptr and std::shared_ptr.
Explain the difference between the two and discuss in which situation which one
should be used. [2]

The std::unique_ptr models unique ownership when we can identify a unique


owner of the resource that is responsible for managing its lifetime. An example
would be the children of a node in a tree – the children will automatically be
deallocated whenever the node is deallocated itself.
The std::shared_ptr models shared ownership for situations where it is not
possible to find a unique owner of the resource. An example would be a graph
without cycles (such as a DAG) where a node can have multiple
incoming edges. A node is automatically deallocated when the last incoming
edge disappears, but not before.

Systems Programming Mock Exam 6


/END
(g) Give the definition of a struct for a Node with a string label in a directed
acyclic graph (DAG) [a graph with directed edges and without cycles] that uses
C++ containers and C++ smart pointers to model and manage the graph data
structure. Explain your solution briefly.
Would a similar implementation be possible for a graph with cycles?
Justify your answer.

Describe (but do not implement) what implementation would be required in C to


achieve a correct and leak free implementation. [4]

struct node {
std::string label;
std::vector<std::shared_ptr<struct node>> edges;
};

The outgoing edges of a node are stored in a vector (an array that can dynamically
grow or shrink). As there is no unique ownership (there can be multiple incoming
edges into a node) we use a shared pointer to model the shared ownership.

This design is problematic if there are cycles in the graph, as then the shared_ptr
form a cycle as well and no memory is released.

An implementation in C would need to ensure that all nodes in the graph are
freed after they are no longer accessible. For this a counter could be maintained
in the node that counts the incoming edges. Once that reaches zero the node can
be freed.

TOTAL MARKS [20]

3. Concurrent Systems Programming

(a) Describe the term Thread and distinguish it from the term Process. [2]

A thread of execution is an independent sequence of program instructions.
Multiple threads can be executed simultaneously.
Multiple threads share the same address space of a single process.
The operating system ensures that the address spaces of processes are protected
from each other.

(b) Give an example showing the need for mutual exclusion to ensure the
correctness of the results of a computation. [2]

Consider the simultaneous removal of elements from a linked list.
Starting from:
a -> b -> c -> d -> NULL
Two threads simultaneously start removing elements from the list.
The first thread removes b by moving the next pointer of a to c:
a ---> c -> d -> NULL
b -/
The second removes c by moving the next pointer of b (now detached) to d:
a ---> c -> d -> NULL
b ------/
The resulting list still contains node c (even though it was explicitly removed).

(c) C++ provides futures and promises as higher-level synchronisation abstractions.


Describe how they work together to simplify writing multi-threaded code. [2]

A future is a handle representing a value that is not yet computed.
We can pass this handle around or store it in memory.
Once we require the value we call .get(), which waits until the value has been
computed.

The promise is the “other side” of a future, allowing a thread to provide a value
(fulfilling the promise) once it has been computed.

Without future and promise, the value would have to be explicitly protected by a
mutex (or another low-level primitive), together with a condition variable used to
wait for the value to be computed.

(d) Look at the following API definition of the ts_set class.

struct ts_set {
private:
std::set<int> set;
std::mutex m;

public:
std::set<int>::iterator find(int i) {
std::unique_lock<std::mutex> lock(m);
return set.find(i);
}
std::set<int>::iterator begin() {
std::unique_lock<std::mutex> lock(m);
return set.begin();
}
std::set<int>::iterator end() {
std::unique_lock<std::mutex> lock(m);
return set.end();
}
void insert(int i) {
std::unique_lock<std::mutex> lock(m);
set.insert(i);
}
void erase(std::set<int>::iterator pos) {
std::unique_lock<std::mutex> lock(m);
set.erase(pos);
}
};

The provided interface and implementation are not safe to use in multithreaded
code despite using a mutex. Explain why and give an example describing a
problematic scenario using the API with multiple threads.
[4]

The interface is not thread safe. Iterators pointing to an element could be
invalidated by another thread removing (or adding) elements.

Two potential problematic scenarios are:
1. A call to find returns an iterator, but immediately after the call the
corresponding element is removed from the set, so that the iterator is now
invalid.
2. Calls to find and end test whether a value is in the set, and only if it is
absent should a call to insert add the value. This is impossible to achieve with
this interface, as the mutex is released between the individual operations,
allowing the tested property (that the value is not in the set) to become false.

(e) Design a thread safe interface for a stack of int values with a limited size and
implement it in C++ or PThreads.

The stack should allow pushing new elements onto the stack and popping (i.e.
removing) elements from the stack.
When the stack is full, push should block and wait until there is space again;
similarly, pop should block when the stack is empty until there are elements to
be removed.
Use condition variables in your solution to avoid busy waiting. [6]

#define N 4
struct stack {
private:
int values[N];
int index = 0;
std::mutex m;
std::condition_variable cv_not_full;
std::condition_variable cv_not_empty;

public:
void push(int i) {
std::unique_lock<std::mutex> lock(m);
cv_not_full.wait(lock, [this]{ return index < N; });
values[index] = i;
index++;
cv_not_empty.notify_one();
}

int pop() {
std::unique_lock<std::mutex> lock(m);
cv_not_empty.wait(lock, [this]{ return index > 0; });
index--;
int i = values[index];
cv_not_full.notify_one();
return i;
}
};

Implement try_push and try_pop that will not block but instead immediately
return when the stack is full/empty. Both functions should indicate if the
operation was performed successfully or not.
[4]

bool try_push(int i) {
std::unique_lock<std::mutex> lock(m);
if (index >= N) {
return false;
} else {
values[index] = i;
index++;
return true;
}
}

std::optional<int> try_pop() {
std::unique_lock<std::mutex> lock(m);
if (index <= 0) {
return std::optional<int>();
} else {
index--;
return values[index];
}
}

TOTAL MARKS [20]

