CS 4407
Algorithms
Graph algorithms in MapReduce
Basic Algorithms
Lecture adapted from: NETS 212: Scalable and Cloud Computing
1
What we have seen so far
• Initial algorithms
– map/reduce model could be used to filter, collect, and
aggregate data values
• Useful for data with limited structure
– We could extract pieces of input data items and collect them
to run various reduce operations
– We could “join” two different data sets on a common key
• But that’s not enough…
2
Beyond average/sum/count
• Much of the world is a network of relationships and
shared features
– Members of a social network can be friends, and may have
shared interests / memberships / etc.
– Customers might view similar movies, and might even be
clustered by interest groups
– The Web consists of documents with links
– Documents are also related by topics, words, authors, etc.
3
Goal: Develop a toolbox
• We need a toolbox of algorithms useful for analyzing
data that has both relationships and properties
• For the next ~2 lectures we’ll start to build this toolbox
– Compare the “traditional” and MapReduce solution
4
Plan for today
• Representing data in graphs NEXT
• Graph algorithms in MapReduce
– Computation model
– Iterative MapReduce
• A toolbox of algorithms
– Single-source shortest path (SSSP)
– k-means clustering
– Classification with Naïve Bayes
5
Images by Jojo Mendoza, Creative Commons licensed
Thinking about related objects
[Figure: example social graph: users Alice, Sunita, Jose, and Mikhail; Alice–Sunita and Sunita–Jose are connected by friend-of edges, and fan-of edges point to the Facebook and Magna Carta pages]
• We can represent related objects as a labeled,
directed graph
• Entities are typically represented as nodes;
relationships are typically edges
– Nodes all have IDs, and possibly other properties
– Edges typically have values, possibly IDs and other
properties
6
Encoding the data in a graph
• Recall the basic definition of a graph:
– G = (V, E), where V is the set of vertices and E is the set of
edges of the form (v1, v2), where v1, v2 ∈ V
• Assume we only care about connected vertices
– Then we can capture a graph simply as the edges
– ... or as an adjacency list: vi goes to [vj, vj+1, … ]
7
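For concreteness, the two encodings can be sketched in Python (the entity names come from the example graph; the slide itself is language-agnostic):

```python
# Two ways to encode the example graph (assumes we only care about
# connected vertices, as stated above).

# Encoding 1: a set of edges (v1, v2)
edges = {
    ("Alice", "Facebook"), ("Alice", "Sunita"),
    ("Jose", "Magna Carta"), ("Jose", "Sunita"),
    ("Mikhail", "Facebook"), ("Mikhail", "Magna Carta"),
    ("Sunita", "Facebook"), ("Sunita", "Alice"), ("Sunita", "Jose"),
}

# Encoding 2: an adjacency list (vi goes to [vj, vj+1, ...]),
# derived mechanically from the edge set
adjacency = {}
for v1, v2 in edges:
    adjacency.setdefault(v1, []).append(v2)
```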
Graph encodings: Set of edges
(Alice, Facebook)
(Alice, Sunita)
(Jose, Magna Carta)
(Jose, Sunita)
(Mikhail, Facebook)
(Mikhail, Magna Carta)
(Sunita, Facebook)
(Sunita, Alice)
(Sunita, Jose)
8
Graph encodings: Adding edge types
(Alice, fan-of, Facebook)
(Alice, friend-of, Sunita)
(Jose, fan-of, Magna Carta)
(Jose, friend-of, Sunita)
(Mikhail, fan-of, Facebook)
(Mikhail, fan-of, Magna Carta)
(Sunita, fan-of, Facebook)
(Sunita, friend-of, Alice)
(Sunita, friend-of, Jose)
9
Graph encodings: Adding weights
(Alice, fan-of, 0.5, Facebook)
(Alice, friend-of, 0.9, Sunita)
(Jose, fan-of, 0.5, Magna Carta)
(Jose, friend-of, 0.3, Sunita)
(Mikhail, fan-of, 0.8, Facebook)
(Mikhail, fan-of, 0.7, Magna Carta)
(Sunita, fan-of, 0.7, Facebook)
(Sunita, friend-of, 0.9, Alice)
(Sunita, friend-of, 0.3, Jose)
10
Recap: Related objects
• We can represent the relationships between related
objects as a directed, labeled graph
– Vertices represent the objects
– Edges represent relationships
• We can annotate this graph in various ways
– Add labels to edges to distinguish different types
– Add weights to edges
– ...
• We can encode the graph in various ways
– Examples: Edge set, adjacency list
11
Plan for today
• Representing data in graphs
• Graph algorithms in MapReduce NEXT
– Computation model
– Iterative MapReduce
• A toolbox of algorithms
– Single-source shortest path (SSSP)
– k-means clustering
– Classification with Naïve Bayes
12
A computation model for graphs
• Once the data is encoded in this way, we can
perform various computations on it
– Simple example: Which users are their friends' best friend?
– More complicated examples (later): PageRank, adsorption, ...
• This is often done by
– annotating the vertices with additional information, and
– propagating the information along the edges
– "Think like a vertex"!
13
A computation model for graphs
[Figure: the weighted social graph again, with callout: "Slightly more technical: How many of my friends have me as their best friend?"]
• Example: Am I my friends' best friend?
14
Can we do this in MapReduce?
map(key: node, value: [<otherNode, relType, strength>])
{
    ...
}
reduce(key: ________, values: list of _________)
{
    ...
}
• Using adjacency list representation?
15
Can we do this in MapReduce?
map(key: node, value: <otherNode, relType, strength>)
{
    ...
}
reduce(key: ________, values: list of _________)
{
    ...
}
• Using single-edge data representation?
16
A computation model for graphs
• Example: Am I my friends' best friend?
– Step #1: Discard irrelevant vertices and edges
17
A computation model for graphs
[Figure: after discarding irrelevant vertices and edges, only the friend-of subgraph remains: Alice –0.9– Sunita –0.3– Jose; Mikhail is isolated. Annotations:
Alice: alice→sunita: 0.9
Sunita: sunita→alice: 0.9, sunita→jose: 0.3
Jose: jose→sunita: 0.3]
• Example: Am I my friends' best friend?
– Step #1: Discard irrelevant vertices and edges
– Step #2: Annotate each vertex with list of friends
– Step #3: Push annotations along each edge
18
A computation model for graphs
[Figure: after pushing annotations along each edge, each vertex knows:
Alice: sunita→alice: 0.9, sunita→jose: 0.3, alice→sunita: 0.9
Sunita: alice→sunita: 0.9, jose→sunita: 0.3, sunita→alice: 0.9, sunita→jose: 0.3
Jose: sunita→alice: 0.9, sunita→jose: 0.3, jose→sunita: 0.3]
• Example: Am I my friends' best friend?
– Step #1: Discard irrelevant vertices and edges
– Step #2: Annotate each vertex with list of friends
– Step #3: Push annotations along each edge
19
A computation model for graphs
[Figure: same annotated graph as on the previous slide]
• Example: Am I my friends' best friend?
– Step #1: Discard irrelevant vertices and edges
– Step #2: Annotate each vertex with list of friends
– Step #3: Push annotations along each edge
– Step #4: Determine result at each vertex
20
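The four steps above can be sketched as a single MapReduce pass in Python (a simulation of the shuffle; the graph literal and function names are illustrative, not from the slides):

```python
from collections import defaultdict

def map_best_friend(node, edges):
    """Map: each node tells its highest-weight friend 'you are my best friend'.
    edges is a list of (otherNode, relType, strength) tuples."""
    friends = [(other, s) for other, rel, s in edges if rel == "friend-of"]
    if friends:
        best, _ = max(friends, key=lambda f: f[1])
        yield best, node

def reduce_best_friend(node, values):
    """Reduce: count how many of node's friends named it their best friend."""
    yield node, len(values)

graph = {
    "Alice":  [("Sunita", "friend-of", 0.9), ("Facebook", "fan-of", 0.5)],
    "Jose":   [("Sunita", "friend-of", 0.3)],
    "Sunita": [("Alice", "friend-of", 0.9), ("Jose", "friend-of", 0.3)],
}

# Simulate the shuffle: group map output by key, then reduce
shuffled = defaultdict(list)
for node, edges in graph.items():
    for k, v in map_best_friend(node, edges):
        shuffled[k].append(v)
counts = dict(kv for k, vs in shuffled.items()
              for kv in reduce_best_friend(k, vs))
# Sunita is the best friend of both Alice and Jose; Alice is Sunita's
```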
A real-world use case
• A variant that is actually used in social networks today:
"Who are the friends of multiple of my friends?"
– Where have you seen this before?
• Friend recommendation!
– Maybe these people should be my friends too!
21
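One way to sketch this friend-recommendation variant is to count common friends in one map/reduce pass (the graph literal and the `min_common` threshold are illustrative assumptions):

```python
from collections import defaultdict

def map_common_friends(person, friends):
    """Map: for each pair of this person's friends, each one learns
    that the other is a friend-of-a-friend (via `person`)."""
    for f in friends:
        for g in friends:
            if f != g:
                yield f, g

def recommend(graph, min_common=2):
    """Reduce: recommend non-friends who are friends of at least
    min_common of my friends."""
    counts = defaultdict(lambda: defaultdict(int))
    for person, friends in graph.items():
        for k, v in map_common_friends(person, friends):
            counts[k][v] += 1          # count shared friends per pair
    return {p: sorted(c for c, n in cs.items()
                      if n >= min_common and c not in graph.get(p, ()))
            for p, cs in counts.items()}
```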
Generalizing…
• Now suppose we want to go beyond direct friend
relationships
– Example: How many of my friends' friends (distance-2
neighbors) have me as their best friend's best friend?
– What do we need to do?
• How about distance k>2?
• To compute the answer, we need to run multiple
iterations of MapReduce!
22
Iterative MapReduce
• The basic model:
copy files from input dir → staging dir 1
(optional: do some preprocessing)
while (!terminating condition) {
    map from staging dir 1
    reduce into staging dir 2
    move files from staging dir 2 → staging dir 1
}
(optional: postprocessing)
move files from staging dir 1 → output dir
• Note that reduce output must be compatible with
the map input!
– What can happen if we filter out some information in the mapper or in the reducer?
23
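In code, the driver might look like this (a sketch; `run_mapreduce` and `converged` stand in for real job submission and the terminating condition, and the staging-directory names are illustrative). Note that the final move is from staging dir 1, since the loop has already moved the last reduce output there:

```python
import os, shutil

def iterate_mapreduce(input_dir, output_dir, run_mapreduce, converged):
    """Driver loop for iterative MapReduce, as in the model above."""
    staging1 = output_dir + ".stage1"
    staging2 = output_dir + ".stage2"
    shutil.copytree(input_dir, staging1)           # input dir -> staging dir 1
    while True:
        os.makedirs(staging2)
        run_mapreduce(src=staging1, dst=staging2)  # map from 1, reduce into 2
        done = converged(staging1, staging2)       # terminating condition
        shutil.rmtree(staging1)
        os.rename(staging2, staging1)              # staging dir 2 -> staging dir 1
        if done:
            break
    os.rename(staging1, output_dir)                # final move to output dir
```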
Graph algorithms and MapReduce
• A centralized algorithm typically traverses a tree or
a graph one item at a time (there’s only one
“cursor”)
– You’ve learned breadth-first and depth-first traversals
• Most algorithms that are based on graphs make use
of multiple map/reduce stages processing one
“wave” at a time
– Sometimes iterative MapReduce, other times chains of
map/reduce
24
"Think like a vertex"
• Let's think about a different model for a bit:
– Suppose we had a network that has exactly the same topology as the graph, with one node for each vertex
– Suppose each vertex A has some local state sA (an (A, sA) tuple in the input file)
– The computation proceeds in rounds (MapReduce rounds). In each round:
• Step #1: Each vertex A reads its local state sA (the map(A, sA) invocation)
• Step #2: A can then send some messages mi to adjacent nodes Bi (map() emits (Bi, mi) tuples)
• Step #3: Then each vertex A looks at all the messages it has received in step #2 (the reduce(A, {m1, m2, ..., mk}) invocation)
• Step #4: Finally, each vertex can update its local state to some other value sA' if it wants to (reduce() emits an (A, sA') tuple)
– This would be a natural fit for many graph algorithms!
25
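A minimal sketch of one such round, phrased as one map/reduce pass over (vertex, state) pairs (the callback names are made up for illustration):

```python
from collections import defaultdict

def run_round(state, compute_messages, update_state):
    """Run one 'think like a vertex' round.
    state            : {vertex A: local state sA}
    compute_messages : A, sA -> iterable of (Bi, mi)   (steps #1-#2 / map)
    update_state     : A, sA, [m1..mk] -> sA'          (steps #3-#4 / reduce)
    """
    inbox = defaultdict(list)
    for a, s_a in state.items():
        for b, m in compute_messages(a, s_a):
            inbox[b].append(m)                 # the shuffle groups by vertex
    return {a: update_state(a, s_a, inbox.get(a, []))
            for a, s_a in state.items()}
```

For example, if `compute_messages` sends a vertex's value to its neighbors and `update_state` takes the maximum of the old value and the incoming messages, repeated rounds flood the maximum value through the graph.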
Recap: MapReduce on graphs
• Suppose we want to:
– compute a function for each vertex in a graph...
– ... using data from vertices at most k hops away
• We can do this as follows:
– "Push" information along the edges
• "Think like a vertex"
– Finally, perform the computation at each vertex
• May need more than one MapReduce phase
– Iterative MapReduce: Outputs of stage i → inputs of stage i+1
26
Plan for today
• Representing data in graphs
• Graph algorithms in MapReduce
– Computation model
– Iterative MapReduce
• A toolbox of algorithms NEXT
– Single-source shortest path (SSSP)
– k-means clustering
– Classification with Naïve Bayes
27
Path-based algorithms
• Sometimes our goal is to compute information about
the paths (sets of paths) between nodes
– Edges may be annotated with cost, distance, or similarity
• Examples of such problems:
– Shortest path from one node to another
– Minimum spanning tree (minimal-cost tree connecting all
vertices in a graph)
– Steiner tree (minimal-cost tree connecting certain nodes)
– Topological sort (node in a DAG comes before all nodes it
points to)
28
Single-Source Shortest Path (SSSP)
Given a directed graph G = (V, E) in which each edge e has a cost c(e):
Compute the cost of reaching each node from the source node s in
the most efficient way (potentially after multiple 'hops')
[Figure: example graph; the source s has distance 0, the distances of a, b, c, d are still unknown. Edge costs: s→a 10, s→c 5, a→b 1, a→c 2, c→a 3, c→b 9, c→d 2, b→d 4, d→b 6, d→s 7]
29
SSSP: Intuition
• We can formulate the problem using induction
– The shortest path follows the principle of optimality: the
last step (u,v) makes use of the shortest path to u
• We can express this as follows:
bestDistanceAndPath(v) {
    if (v == source) then {
        return <distance 0, path [v]>
    } else {
        u := argmin over predecessors u of (bestDistanceAndPath(u).distance + dist[u,v])
        return <bestDistanceAndPath(u).distance + dist[u,v], bestDistanceAndPath(u).path + [v]>
    }
}
30
SSSP: traditional solution
• Traditional approach: Dijkstra's algorithm
V: vertices, E: edges, S: start node

// Initialize length and last step of path to default values
foreach v in V
    dist_S_To[v] := infinity
    predecessor[v] := nil
spSet := {}
Q := V

// Update length and path based on edges radiating from u
while (Q not empty) do
    u := Q.removeNodeClosestTo(S)
    spSet := spSet + {u}
    foreach v in V where (u,v) in E
        if (dist_S_To[v] > dist_S_To[u] + cost(u,v)) then
            dist_S_To[v] := dist_S_To[u] + cost(u,v)
            predecessor[v] := u
31
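The pseudocode translates to a short runnable version (a sketch that uses a binary heap in place of the abstract removeNodeClosestTo, a standard substitution):

```python
import heapq

def dijkstra(graph, source):
    """graph: {vertex: [(neighbor, cost), ...]}.
    Returns (dist_S_To, predecessor), as in the pseudocode."""
    dist = {v: float("inf") for v in graph}
    predecessor = {v: None for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    sp_set = set()                        # the 'spSet' of the pseudocode
    while heap:
        d, u = heapq.heappop(heap)        # Q.removeNodeClosestTo(S)
        if u in sp_set:
            continue                      # skip stale heap entries
        sp_set.add(u)
        for v, cost in graph[u]:
            if dist[v] > d + cost:        # relax edge (u, v)
                dist[v] = d + cost
                predecessor[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, predecessor
```

On the example graph this yields the same final state as the trace on the next slides: distances s=0, a=8, b=9, c=5, d=7.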
Example from CLR 2nd ed. p. 528
SSSP: Dijkstra in Action
Q = {s,a,b,c,d} spSet = {}
dist_S_To: {(a,∞), (b,∞), (c,∞), (d,∞)}
predecessor: {(a,nil), (b,nil), (c,nil), (d,nil)}
32
SSSP: Dijkstra in Action
Q = {a,b,c,d} spSet = {s}
dist_S_To: {(a,10), (b,∞), (c,5), (d,∞)}
predecessor: {(a,s), (b,nil), (c,s), (d,nil)}
33
SSSP: Dijkstra in Action
Q = {a,b,d} spSet = {c,s}
dist_S_To: {(a,8), (b,14), (c,5), (d,7)}
predecessor: {(a,c), (b,c), (c,s), (d,c)}
34
SSSP: Dijkstra in Action
Q = {a,b} spSet = {c,d,s}
dist_S_To: {(a,8), (b,13), (c,5), (d,7)}
predecessor: {(a,c), (b,d), (c,s), (d,c)}
35
SSSP: Dijkstra in Action
Q = {b} spSet = {a,c,d,s}
dist_S_To: {(a,8), (b,9), (c,5), (d,7)}
predecessor: {(a,c), (b,a), (c,s), (d,c)}
36
SSSP: Dijkstra in Action
Q = {} spSet = {a,b,c,d,s}
dist_S_To: {(a,8), (b,9), (c,5), (d,7)}
predecessor: {(a,c), (b,a), (c,s), (d,c)}
37
SSSP: How to parallelize?
• Dijkstra traverses the graph along a single route at a
time, prioritizing its traversal to the next step based
on total path length (and avoiding cycles)
– No real parallelism to be had here!
• Intuitively, we want something that "radiates" from the origin, one "edge hop distance" at a time
– Each step outwards can be done in parallel, before another
iteration occurs - or we are done
– Recall our earlier discussion: Scalability depends on the
algorithm, not (just) on the problem!
38
SSSP: Revisiting the inductive definition
bestDistanceAndPath(v) {
    if (v == source) then {
        return <distance 0, path [v]>
    } else {
        u := argmin over predecessors u of (bestDistanceAndPath(u).distance + dist[u,v])
        return <bestDistanceAndPath(u).distance + dist[u,v], bestDistanceAndPath(u).path + [v]>
    }
}
• Dijkstra’s algorithm carefully considered each u in a
way that allowed us to prune certain points
• Instead we can look at all potential u’s for each v
– Compute iteratively, by keeping a “frontier set” of u nodes i
edge-hops from the source
39
SSSP: MapReduce formulation
• init:
– For each node: nodeID → <dist, next, {<succ-node-ID, edge-cost>}>, where dist is the length of the shortest path we have found so far from the source to nodeID (0 for the source, ∞ for everyone else), next is the next hop on that path (initially "-"), and the set is nodeID's adjacency list
• map:
– take nodeID → <dist, next, {<succ-node-ID, edge-cost>}>
– For each succ-node-ID:
• emit succ-node-ID → <nodeID, dist + edge-cost>
(this is a new path from the source to succ-node-ID that we just discovered; not necessarily the shortest)
– emit nodeID → <dist, {<succ-node-ID, edge-cost>}> (Why is this necessary?)
• reduce:
– dist := min cost from a predecessor; next := that predecessor
– emit nodeID → <dist, next, {<succ-node-ID, edge-cost>}>
• Repeat until no changes
• Postprocessing: Remove adjacency lists
40
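Putting the pieces together, the formulation above can be sketched as runnable Python (the "adj"/"cand" tags and the driver loop are implementation choices for the simulation, not part of the slide):

```python
from collections import defaultdict

INF = float("inf")

def sssp_map(node, state):
    dist, nxt, adj = state
    yield node, ("adj", dist, nxt, adj)            # keep the record alive
    for succ, cost in adj:                         # newly discovered paths
        yield succ, ("cand", node, dist + cost)    # (not necessarily shortest)

def sssp_reduce(node, values):
    dist, nxt, adj = INF, None, []
    for v in values:
        if v[0] == "adj":                          # the node's own record
            _, d, n, adj = v
            if d < dist:
                dist, nxt = d, n
        else:                                      # a candidate path
            _, pred, d = v
            if d < dist:
                dist, nxt = d, pred
    return node, (dist, nxt, adj)

def sssp(graph, source):
    """graph: {node: [(succ, cost), ...]}. Iterate until no changes,
    then strip the adjacency lists (the postprocessing step)."""
    state = {v: (0 if v == source else INF, None, adj)
             for v, adj in graph.items()}
    while True:
        inbox = defaultdict(list)                  # simulated shuffle
        for node, st in state.items():
            for k, v in sssp_map(node, st):
                inbox[k].append(v)
        new_state = dict(sssp_reduce(n, vs) for n, vs in inbox.items())
        if new_state == state:                     # repeat until no changes
            return {v: st[0] for v, st in state.items()}
        state = new_state
```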
Example: SSSP – Parallel BFS in
MapReduce
• Adjacency matrix:
      s    a    b    c    d
  s        10        5
  a              1   2
  b                       4
  c        3    9         2
  d   7         6
• Adjacency list:
  s: (a, 10), (c, 5)
  a: (b, 1), (c, 2)
  b: (d, 4)
  c: (a, 3), (b, 9), (d, 2)
  d: (s, 7), (b, 6)
[Figure: the example graph from before]
41
Iteration 0: Base case
mapper: (a,<s,10>) (c,<s,5>)
reducer: (a,<10, ...>) (c,<5, ...>)
[Figure: the "wave" has reached a and c; distance estimates: s = 0, all others still ∞]
42
Iteration 0 – Parallel BFS in MapReduce
• Map input: <node ID, <dist, adj list>>
<s, <0, <(a, 10), (c, 5)>>>
<a, <inf, <(b, 1), (c, 2)>>>
<b, <inf, <(d, 4)>>>
<c, <inf, <(a, 3), (b, 9), (d, 2)>>>
<d, <inf, <(s, 7), (b, 6)>>>
• Map output: <dest node ID, dist> (plus a pass-through copy of each input record)
<a, 10> <c, 5>
<b, inf> <c, inf>
<d, inf>
<a, inf> <b, inf> <d, inf>
<s, inf> <b, inf>
<s, <0, <(a, 10), (c, 5)>>>
<a, <inf, <(b, 1), (c, 2)>>>
<b, <inf, <(d, 4)>>>
<c, <inf, <(a, 3), (b, 9), (d, 2)>>>
<d, <inf, <(s, 7), (b, 6)>>>
43
Iteration 0 – Parallel BFS in MapReduce
• Reduce input: <node ID, dist>
<s, <0, <(a, 10), (c, 5)>>> <s, inf>
<a, <inf, <(b, 1), (c, 2)>>> <a, 10> <a, inf>
<b, <inf, <(d, 4)>>> <b, inf> <b, inf> <b, inf>
<c, <inf, <(a, 3), (b, 9), (d, 2)>>> <c, 5> <c, inf>
<d, <inf, <(s, 7), (b, 6)>>> <d, inf> <d, inf>
44
Iteration 1
mapper: (a,<s,10>) (c,<s,5>) (a,<c,8>) (c,<a,12>) (b,<a,11>)
(b,<c,14>) (d,<c,7>)
reducer: (a,<8, ...>) (c,<5, ...>) (b,<11, ...>) (d,<7, ...>)
[Figure: the "wave" expands from a and c; distance estimates before this round: a = 10, c = 5, b = ∞, d = ∞]
46
Iteration 1 – Parallel BFS in MapReduce
• Reduce output: <node ID, <dist, adj list>> = Map input for next iteration
<s, <0, <(a, 10), (c, 5)>>>
<a, <10, <(b, 1), (c, 2)>>>
<b, <inf, <(d, 4)>>>
<c, <5, <(a, 3), (b, 9), (d, 2)>>>
<d, <inf, <(s, 7), (b, 6)>>>
• Map output: <dest node ID, dist> (plus a pass-through copy of each input record)
<a, 10> <c, 5>
<b, 11> <c, 12>
<d, inf>
<a, 8> <b, 14> <d, 7>
<s, inf> <b, inf>
47
Iteration 1 – Parallel BFS in MapReduce
• Reduce input: <node ID, dist>
<s, <0, <(a, 10), (c, 5)>>> <s, inf>
<a, <10, <(b, 1), (c, 2)>>> <a, 10> <a, 8>
<b, <inf, <(d, 4)>>> <b, 11> <b, 14> <b, inf>
<c, <5, <(a, 3), (b, 9), (d, 2)>>> <c, 5> <c, 12>
<d, <inf, <(s, 7), (b, 6)>>> <d, inf> <d, 7>
48
Iteration 2
mapper: (a,<s,10>) (c,<s,5>) (b,<a,9>) (c,<a,10>) (a,<c,8>)
(b,<c,14>) (d,<c,7>) (d,<b,15>) (s,<d,14>) (b,<d,13>)
reducer: (a,<8>) (c,<5>) (b,<9>) (d,<7>)
[Figure: the "wave" reaches b and d; distance estimates before this round: a = 8, b = 11, c = 5, d = 7]
50
Iteration 2 – Parallel BFS in MapReduce
• Reduce output: <node ID, <dist, adj list>> = Map input for next iteration
<s, <0, <(a, 10), (c, 5)>>>
<a, <8, <(b, 1), (c, 2)>>>
<b, <11, <(d, 4)>>>
<c, <5, <(a, 3), (b, 9), (d, 2)>>>
<d, <7, <(s, 7), (b, 6)>>>
… the rest omitted …
51
Iteration 3
mapper: (a,<s,10>) (c,<s,5>) (b,<a,9>) (c,<a,10>) (a,<c,8>)
(b,<c,14>) (d,<c,7>) (d,<b,13>) (s,<d,14>) (b,<d,13>)
reducer: (a,<8>) (c,<5>) (b,<9>) (d,<7>)
No change! Convergence!
[Figure: final distances: s = 0, a = 8, b = 9, c = 5, d = 7]
Question: If a vertex's path cost is the same in two consecutive rounds, can we be sure that this vertex has converged?
52
BFS Pseudo-Code
Stopping Criterion
• How many iterations are needed in parallel BFS (equal
edge weight case)?
• Convince yourself: when a node is first “discovered”,
we’ve found the shortest path
• Now answer the question...
– Six degrees of separation?
• Practicalities of implementation in MapReduce
Comparison to Dijkstra
• Dijkstra's algorithm is more efficient
– At each step it only pursues edges out of the minimum-cost node on the frontier
• MapReduce explores all paths in parallel
– Lots of “waste”
– Useful work is only done at the “frontier”
• Why can’t we do better using MapReduce?
Summary: SSSP
• Path-based algorithms typically involve iterative
map/reduce
• They are typically formulated in a way that
traverses in “waves” or “stages”, like breadth-first
search
– This allows for parallelism
– They need a way to test for convergence
• Example: Single-source shortest path (SSSP)
– Original Dijkstra formulation is hard to parallelize
– But we can make it work with the "wave" approach
56