Comp322 s19 Lec01 Slides v1 PDF

Download as pdf or txt
Download as pdf or txt
You are on page 1of 17

COMP 322 / ELEC 323:

Fundamentals of
Parallel Programming
Lecture 1: Task Creation & Termination
(async, finish)

Instructors: Mack Joyner, Zoran Budimlić


Department of Computer Science, Rice University
{mjoyner, zoran}@rice.edu

http://comp322.rice.edu

COMP 322 Lecture 1 07 January 2019


Special thanks to Vivek Sarkar!

COMP 322 Lecture 1 07 January 2019


Your Teaching Staff!
• Undergraduate TAs
— Liam Bonnage, Harrison Brown, Mustafa El-Gamal,
Krishna Goel, Ryan Green, Ryan Han, Rishu Harpavat,
Namanh Kapur, Tian Lan, Tam Le, Will LeVine, Eva Ma,
Hamza Nauman, Rutvik Patel, Aryan Sefidi, Jeemin
Sim, Tory Songyang, Jiaqi Wang, Erik Yamada, Yifan
Yang, Aydin Zanagar
• Graduate TAs
— Jonathan Sharman, Srdjan Milakovic
• Instructors
— Mack Joyner, Zoran Budimlić
3 COMP 322, Spring 2019 (M. Joyner, Z. Budimlić)
Moore’s Law and Dennard Scaling

Gordon Moore (co-founder of Intel) predicted in


1965 that the transistor density of Dennard Scaling states
semiconductor chips would double roughly that power for a fixed
every 1-2 years (Moore’s Law)
chip area remains
area of transistor halves every 1-2 years
constant as transistors
feature size reduces by √2 every 1-2 years
grow smaller
Slide source: Jack Dongarra

4 COMP 322, Spring 2019 (M. Joyner, Z. Budimlić)


Recent Technology Trends
• Chip density (transistors) is Source: Intel, Microsoft (Sutter) and Stanford
(Olukotun, Hammond)
increasing ~2x every 2 years

• Clock speed is plateauing


below 10 GHz so that chip
power stays below 100W

• Instruction-level parallelism
(ILP) in hardware has also
plateaued below 10
instructions/cycle

• ⇒ Parallelism must be
managed by software!

5 COMP 322, Spring 2018 (M. Joyner, Zoran Budimlić)


What is Parallel Computing?
• Parallel computing: using multiple processors in parallel to solve
problems more quickly than with a single processor and/or with less
energy
• Example of a parallel computer
—An 8-core Symmetric Multi-Processor (SMP) consisting of four dual-
core chip microprocessors (CMPs)

CMP-0 CMP-1 CMP-2 CMP-3

Source: Figure 1.5 of Lin & Snyder


book, Addison-Wesley, 2009

6 COMP 322, Spring 2019 (M. Joyner, Z. Budimlić)


All Computers are Parallel Computers ---
Why?

7 COMP 322, Spring 2019 (M. Joyner, Z. Budimlić)


Parallelism Saves Power
(Simplified Analysis)
Nowadays (post Dennard Scaling), Power ~ (Capacitance) * (Voltage)2 * (Frequency)
and maximum Frequency is capped by Voltage
! Power is proportional to (Frequency)3

Baseline example: single 1GHz core with power P

Option A: Increase clock frequency to 2GHz ! Power = 8P

Option B: Use 2 cores at 1 GHz each ! Power = 2P

• Option B delivers same performance as Option A with 4x less power … provided


software can be decomposed to run in parallel!

8 COMP 322, Spring 2019 (M. Joyner, Z. Budimlić)


What is Parallel Programming?
• Specification of operations that can
Task A Task B
be executed in parallel
• A parallel program is decomposed
into sequential subcomputations
called tasks

Core 0 Core 1
Parallel programming constructs
define task creation, termination, and L1 cache L1 cache

interaction BUS

L2 Cache

Schematic of a dual-core
Processor

9 COMP 322, Spring 2019 (M. Joyner, Z. Budimlić)


Example
To introduce of a Sequential
you to a concrete Program:

example of parallel programming, let us fir
algorithm for computing the sum of the elements of an array of numbers
Computing the sum of array elements
Algorithm 1: Sequential ArraySum
Computation Graph
Input: Array of numbers, X.
Output: sum = sum of elements in array X.
sum 0;
for i 0 to X.length 1 do
sum sum + X[i];
return sum;

Observations:
This algorithm is simple to understand since it sums the elements of
•However,
The decision to sum
we could haveupobtained
the elements
the from
sameleft
algebraic result by summin
to rightThis
instead. was over-specification
arbitrary of the ordering of operations in sequen
referred to as the Von Neumann bottleneck [1]. The left-to-right evalua
•seenThein computation graphgraph
the computation shows that all
shown in Figure 1. We will study compu
operations
course. must
For now, be executed
think of each node sequentially
or vertex (denoted by a circle) as an
edge (denoted by an arrow) as an ordering constraint between the oper
flow
10
of the outputCOMP
from the first operation to the input of the second op
322, Spring 2019 (M. Joyner, Z. Budimlić)
Parallelization Strategy for two cores
(Two-way Parallel Array Sum)
Task 0: Compute sum of Task 1: Compute sum of
lower half of array upper half of array

+"

Compute total sum

Basic idea:
• Decompose problem into two tasks for partial sums
• Combine results to obtain final answer
• Parallel divide-and-conquer pattern

11 COMP 322, Spring 2019 (M. Joyner, Z. Budimlić)


Async and Finish Statements for Task
Creation and Termination (Pseudocode)
async S finish S
" Execute S, but wait until all
• Creates a new child task that
asyncs in S’s scope have
executes statement S
terminated.

// T0(Parent task) T1 T0
STMT0; STMT0
finish { //Begin finish
async { fork
STMT1; //T1(Child task)
} STMT1 STMT2
STMT2; //Continue in T0
//Wait for T1
join
} //End finish
STMT3; //Continue in T0 STMT3
12 COMP 322, Spring 2019 (M. Joyner, Z. Budimlić)
since there is no edge or sequence of edges connecting Tasks T 2 and T 3. This indicates that tasks
Two-way Parallel Array Sum
can execute in parallel with each other; for example, if your computer has two processor cores, T
using async & finish constructs
can be executed on two di↵erent processors at the same time. We will see much richer examples
programs using async, finish and other constructs during the course.

Algorithm 2: Two-way Parallel ArraySum


Input: Array of numbers, X.
Output: sum = sum of elements in array X.
// Start of Task T1 (main program)
sum1 0; sum2 0;
// Compute sum1 (lower half) and sum2 (upper half) in parallel.
finish{
async{
// Task T2
for i 0 to X.length/2 1 do
sum1 sum1 + X[i];
};
async{
// Task T3
for i X.length/2 to X.length 1 do
sum2 sum2 + X[i];
};
};
// Task T1 waits for Tasks T2 and T3 to complete
// Continuation of Task T1
sum sum1 + sum2;
return sum;

13 COMP 322, Spring 2019 (M. Joyner, Z. Budimlić)


Course Syllabus
• Fundamentals of Parallel Programming taught in three modules
1. Parallelism
2. Concurrency
3. Locality & Distribution
• Each module is subdivided into units, and each unit into topics
• Lecture and lecture handouts will introduce concepts using pseudocode notations
• Labs and programming assignments will be in Java 8
—Initially, we will use the Habanero-Java (HJ) library developed at Rice as a pedagogic
parallel programming model
– HJ-lib is a Java 8 library (no special compiler support needed)
– HJ-lib contains many features that are easier to use than standard Java threads/
tasks, and are also being added to future parallel programming models
—Later, we will learn parallel programming using standard Java libraries, and
combinations of Java libs + HJ-lib

14 COMP 322, Spring 2019 (M. Joyner, Z. Budimlić)


Grade Policies
Course Rubric
• Homework (5) 40% (written + programming components)
• Weightage proportional to # weeks for homework
• Exams (2) 40% (scheduled midterm + scheduled final)
• Labs 10% (labs need to be checked off by Monday)
• Quizzes 5% (on-line quizzes on Canvas)
• Class Participation 5% (in-class worksheets)

15 COMP 322, Spring 2019 (M. Joyner, Z. Budimlić)


Next Steps
• IMPORTANT:
—Bring your laptop to this week’s lab at 4pm on Thursday (HH
100)
—Watch videos for topics 1.2 & 1.3 for next lecture on Wednesday
• HW1 will be assigned on Jan 9th and be due on Jan 23rd.
(Homework is normally due on Wednesdays.)
• Each quiz (to be taken online on Canvas) will be due on the Friday
after the unit is covered in class. The first quiz for Unit 1 (topics 1.1
- 1.5) is due by Jan 25.
• See course web site for syllabus, work assignments, due dates, …
• http://comp322.rice.edu

16 COMP 322, Spring 2019 (M. Joyner, Z. Budimlić)


OFFICE HOURS
• Regular office hour schedule can be found at
Office Hours link on course web site
• Send email to instructors (mjoyner@rice.edu,
zoran@rice.edu) if you need to meet some other
time this week
• And remember to post questions on Piazza!

17 COMP 322, Spring 2019 (M. Joyner, Z. Budimlić)

You might also like