Introduction to Algorithms
Class 3: Graph Algorithm 2
Jianxi Gao
Department of Computer Science
Rensselaer Polytechnic Institute
www.gaojianxi.com
About the Lab
Number of shortest paths
3 2 2 2 3
5 4 3 4
5 5 1 1 4 2
1 0 1 5
Number of shortest paths
3 2 2 2 3
5 4 3 4
5 5 1 1 4 2
1 0 1 5
PATHS
A path is a sequence of nodes in which each node is adjacent to the next one.
Pi0,in of length n between nodes i0 and in is an ordered collection of n+1 nodes and n links
Pn = {(i0 ,i1),(i1,i2 ),(i2 ,i3 ),...,(in-1,in )}
Connected graph
B
B
A
A Telephone
Internet
C
transportation
D E C
D E
F
F
G
G Bridge: A-F
Connected Not connected
Def 1: Nodes u and v are connected, 𝐢𝐟 ∃ a path connection u and v.
Def 2: A graph is connected if all nodes are connected to each other.
A disconnected graph is made up by two or more connected components.
Example from the Book
Breadth-first Search
Procedure BFS(𝐺, 𝑠)
Input: Graph 𝐺 = (𝑉, 𝐸), directed or undirected; vertex 𝑠 ∈ 𝑉.
Output: For all vertices u reachable from 𝑠, 𝑑𝑖𝑠𝑡(𝑡) is set to
the distance from 𝑠 to u.
Breadth-first Search
for all u ∈ 𝑉;
𝑑𝑖𝑠𝑡 𝑢 = ∞
𝑑𝑖𝑠𝑡 𝑠 = 0
𝑄 = [𝑠] (queue only contains 𝑠)
while 𝑄 is not empty
𝑢 = 𝑒𝑗𝑒𝑐𝑡(𝑄) (Remove the first element 𝑢 from 𝑄)
for all edges 𝑢, 𝑣 ∈ 𝐸:
if 𝑑𝑖𝑠𝑡 𝑣 = ∞:
𝑖𝑛𝑗𝑒𝑐𝑡 𝑄, 𝑣
𝑑𝑖𝑠𝑡 𝑣 = 𝑑𝑖𝑠𝑡 𝑢 + 1
Meeting princess
Meeting princes
Here are the questions:
(1) Is there any path from Mario to Bell?
(2) Can Mario reach Bell in 4 steps (letters)?
(3) If not, the minimal steps?
Let’s spend 10 mins to help Mario.
Meeting princess
We need to build a graph.
Meeting princess
Meeting princess
Meeting princess
Meeting princess BFS
for all u ∈ 𝑉 = {𝐴, 𝐵, 𝐶, 𝐷, 𝐸, 𝐹, 𝐺, 𝐻, 𝐼, 𝐽};
𝑑𝑖𝑠𝑡 𝑢 = ∞
Initialization.
𝑑𝑖𝑠𝑡 𝐺 = 0
𝑄 = [𝐺] (queue only contains 𝐺)
1. 𝑢 = 𝐺,
2. 𝐺, 𝐷 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐷 = ∞
3. 𝑄 = 𝐷 , 𝑑𝑖𝑠𝑡 𝐷 = 𝑑𝑖𝑠𝑡 𝐺 + 1 = 1
4. 𝐺, 𝐻 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐻 = ∞
5. 𝑄 = 𝐷, 𝐻 , 𝑑𝑖𝑠𝑡 𝐻 = 𝑑𝑖𝑠𝑡 𝐺 + 1 = 1
Meeting princess BFS
5. 𝑄 = 𝐷, 𝐻 , 𝑑𝑖𝑠𝑡 𝐻 = 𝑑𝑖𝑠𝑡 𝐺 + 1 = 1
6. 𝑢 = 𝐷,
7. 𝐷, 𝐴 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐴 = ∞
8. 𝑄 = 𝐻, 𝐴 , 𝑑𝑖𝑠𝑡 𝐴 = 𝑑𝑖𝑠𝑡 𝐷 + 1 = 2
9. 𝐷, 𝐻 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐻 = 1 →continue
10.𝑢 = 𝐻
11. 𝐻, 𝐷 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐷 = 1 →continue
12. 𝑄 = 𝐴
Meeting princess BFS
12. 𝑄 = 𝐴
13. 𝑢 = 𝐴,
14. 𝐴, 𝐵 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐵 = ∞
15. 𝑄 = 𝐵 , 𝑑𝑖𝑠𝑡 𝐵 = 𝑑𝑖𝑠𝑡 𝐴 + 1 = 3
16. 𝐴, 𝐶 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐶 = ∞
17. 𝑄 = 𝐵, 𝐶 , 𝑑𝑖𝑠𝑡 𝐶 = 𝑑𝑖𝑠𝑡 𝐴 + 1 = 3
18. 𝐴, 𝐷 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐷 = 1 →continue
19. 𝑄 = 𝐵, 𝐶
Meeting princess BFS
19. 𝑄 = 𝐵, 𝐶 What is the next?
20. 𝑢 = 𝐵,
21. 𝐵, 𝐸 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐸 = ∞
22. 𝑄 = 𝐶, 𝐸 , 𝑑𝑖𝑠𝑡 𝐸 = 𝑑𝑖𝑠𝑡 𝐵 + 1 = 4
23. 𝐵, 𝐹 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐹 = ∞
24. 𝑄 = 𝐶, 𝐸, 𝐹 , 𝑑𝑖𝑠𝑡 𝐹 = 𝑑𝑖𝑠𝑡 𝐵 + 1 = 4
25. 𝐵, 𝐴 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐴 = 3 →continue
26. 𝑄 = 𝐶, 𝐸, 𝐹
Meeting princess BFS
26. 𝑄 = 𝐶, 𝐸, 𝐹
27. 𝑢 = 𝐶,
28. 𝐶, 𝐴 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐴 = 3 →continue
29. 𝐶, 𝐹 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐹 = 4 →continue
30. 𝑄 = 𝐸, 𝐹
Meeting princess BFS
30. 𝑄 = 𝐸, 𝐹
31. 𝑢 = 𝐸,
32. 𝐸, 𝐼 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐼 = ∞
33. 𝑄 = 𝐹, 𝐼 , 𝑑𝑖𝑠𝑡 𝐼 = 𝑑𝑖𝑠𝑡 𝐸 + 1 = 5
34. 𝐸, 𝐽 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐽 = ∞
35. 𝑄 = 𝐹, 𝐼, 𝐽 , 𝑑𝑖𝑠𝑡 𝐽 = 𝑑𝑖𝑠𝑡 𝐸 + 1 = 5
For the rest of the elements in 𝑄 …
Meeting princes
Here are some questions:
(1) Is there any path from Mario to Bell? Yes
(2) Can Mario reach Bell in 4 steps? No
(3) If not, the minimal steps? 5
Main questions about BFS
(1) Is it correct?
Are you kidding me?
(2) Is it efficient?
Can we prove that this algorithm is correct?
Is it correct?
For each d=0,1,2,…, these is a moment at which
(1) all nodes at distance ≤ 𝑑 from 𝑠 have their distance correctly set;
(2) all other nodes have their distance set to ∞;
(3) The queue contains exactly the nodes at distance d.
We will use an induction on distance to prove?
𝑄 = [𝑠] d=0
𝑄 = [𝐴1 , 𝐴2, . . 𝐴𝑘 ] d=1
𝑄 = 𝐵1 , 𝐵2 , … , 𝐵𝑟 d=2
𝑄 = [𝐶1, 𝐶2 , … , 𝐶𝑡 ] d=3
Is it efficient?
The overall running time of this algorithm is linear, 𝑂 𝑉 + 𝐸 .
The Meeting princess problem
1. 𝑢 = 𝐺, Why I only checked 𝐺, 𝐷 𝑎𝑛𝑑 𝐺, 𝐻
2. 𝐺, 𝐷 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐷 = ∞
It depends on how you
3. 𝑄 = 𝐷 , 𝑑𝑖𝑠𝑡 𝐷 = 𝑑𝑖𝑠𝑡 𝐺 + 1 = 1 represent the graph?
4. 𝐺, 𝐻 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐻 = ∞
5. 𝑄 = 𝐷, 𝐻 , 𝑑𝑖𝑠𝑡 𝐻 = 𝑑𝑖𝑠𝑡 𝐺 + 1 = 1
ADJACENCY MATRIX (Undirected)
Aij=1 if there is a link between node i and j
Links: undirected (symmetrical)
Aij=0 if nodes i and j are not connected.
Graph:
L
A
4
F M
I
D 3
2
B G 1
H
C
Undirected links :
coauthorship links
Actor network
protein interactions
An Example of UNDIRECTED Graphs
Map of scientific collaboration between researchers
ADJACENCY MATRIX
• Space proportional to Θ(n2). (𝑛 = |𝑉|)
• Checking if (u, v) is an edge takes Θ(1) time
• Identify all edges takes Θ(n2) time
ADJACENCY LISTs
Node-indexed array of lists.
• Space proportional to Θ(|V|+|E|).
• Checking if (u, v) is an edge takes Θ(degree(u)) time
• Identify all edges takes Θ(|V|+|E|) time
ADJACENCY MATRIX AND NODE DEGREES
Node degree: the number of links connected to the node.
A
kA =1
Question for a graph 𝐺 = (𝑉, 𝐸):
B What is the total degree of a graph?
kB = 4
æ0 1 0 1ö N
4 ç ÷ ki = åA ij
ç 1 0 0 1÷ j =1
Aij =
ç0 0 0 1÷ N
3 ç ÷ k j = å Aij
2
è1 1 1 0ø i=1
1
Aij = A ji N N
L = å ki = å Aij
1 1
Aii = 0 2 i=1 2 ij
Is it efficient?
The overall running time of this algorithm is linear, 𝑂 𝑉 + 𝐸 .
The Meeting princess problem
1. 𝑢 = 𝐺,
Should we use adjacency matrix
2. 𝐺, 𝐷 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐷 = ∞ or adjacency list?
3. 𝑄 = 𝐷 , 𝑑𝑖𝑠𝑡 𝐷 = 𝑑𝑖𝑠𝑡 𝐺 + 1 = 1
4. 𝐺, 𝐻 ∈ 𝑬, 𝐚𝐧𝐝 𝑑𝑖𝑠𝑡 𝐻 = ∞
5. 𝑄 = 𝐷, 𝐻 , 𝑑𝑖𝑠𝑡 𝐻 = 𝑑𝑖𝑠𝑡 𝐺 + 1 = 1
Average Degree
COMPLETE GRAPH
The maximum number of links a network
of N nodes can have is: Lmax = æç N ö÷ = N(N -1)
è2ø 2
A graph with degree L=Lmax is called a complete graph,
and its average degree is <k>=N-1
REAL NETWORKS ARE SPARSE
Most networks observed in real systems are sparse:
L << Lmax
or
<k> <<N-1.
WWW (ND Sample): N=325,729; L=1.4 106 Lmax=1012 <k>=4.51
Protein (S. Cerevisiae): N= 1,870; L=4,470 Lmax=107 <k>=2.39
Coauthorship (Math): N= 70,975; L=2 105 Lmax=3 1010 <k>=3.9
Movie Actors: N=212,250; L=6 106 Lmax=1.8 1013 <k>=28.78
(Source: Albert, Barabasi, RMP2002)
ADJACENCY MATRICES ARE SPARSE
Is BFS efficient?
for all u ∈ 𝑉; 𝑂(𝑛)
𝑑𝑖𝑠𝑡 𝑢 = ∞
𝑑𝑖𝑠𝑡 𝑠 = 0 𝑂(1) 𝑂(𝑛)
𝑄 = [𝑠] (queue only contains 𝑠) 𝑂(1)
while 𝑄 is not empty 𝑂(𝑛)
𝑂(𝑛)
𝑢 = 𝑒𝑗𝑒𝑐𝑡(𝑄) 𝑂(1)
for all edges 𝑢, 𝑣 ∈ 𝐸: 𝑂(𝑛)
𝑂(𝑛)
if 𝑑𝑖𝑠𝑡 𝑣 = ∞: 𝑂(1)
𝑖𝑛𝑗𝑒𝑐𝑡 𝑄, 𝑣
Looks like 𝑂(𝑛2 ), not linear.
𝑑𝑖𝑠𝑡 𝑣 = 𝑑𝑖𝑠𝑡 𝑣 + 1
Trick!!!
Is BFS efficient?
for all u ∈ 𝑉; 𝑂(𝑛)
𝑑𝑖𝑠𝑡 𝑢 = ∞
𝑑𝑖𝑠𝑡 𝑠 = 0 𝑂(1) 𝑂(𝑛)
𝑄 = [𝑠] (queue only contains 𝑠) 𝑂(1)
while 𝑄 is not empty 𝑂(𝑛)
𝑂(𝑛)
𝑢 = 𝑒𝑗𝑒𝑐𝑡(𝑄) 𝑂(1)
for all edges 𝑢, 𝑣 ∈ 𝐸: 𝑂(𝑑𝑒𝑔𝑟𝑒𝑒(𝑢)) 𝑂(𝑚)
if 𝑑𝑖𝑠𝑡 𝑣 = ∞: 𝑂(1)
𝑖𝑛𝑗𝑒𝑐𝑡 𝑄, 𝑣
𝑑𝑖𝑠𝑡 𝑣 = 𝑑𝑖𝑠𝑡 𝑣 + 1
The overall running time of this algorithm is linear, 𝑂 𝑉 + 𝐸 .
Graph Algorithms
Part II
UNDIRECTED NETWORKS
Notation. G = (V,E)
V = nodes (or vertices).
E = edges (or arcs) between pairs of nodes
Captures pairwise relationship between objects.
Network size parameters: n = |V |, m = |E|.
L
A
F M V = {A,B,C,D,E,F,G,H,I, L,M}
I
E {A-D,B-B,B-D,C-D,C-D,D-F,F-G,G-H,H-I,L-M}
D
B G
H
A-D is the same as D-A
C
n = 10, m = 10.
Some graph applications
Graph Nodes Edges
transportation street intersections highways
communication computers fiber optic cables
World Wide Web web pages hyperlinks
social people relationships
food web species predator-prey
software systems functions function calls
scheduling tasks precedence constraints
circuits gates wires
45
Tress
root r
parent of v
child of v
a tree the same tree, rooted at 1
46
GUI tree
• GUI containment hierarchy. Describe
organization of GUI widgets.
Reference: http://java.sun.com/docs/books/tutorial/uiswing/overview/anatomy.html
47
COVID-19
• It spreads so fast because of huge human
mobility
• The spreading process seems unpredictable
COVID-19
49
COVID-19: travel restrictions
https://science.rpi.edu/computer-science/news/new-covid-19-
50 model-reveals-need-better-travel-restriction-implementation
International airline network
What parts of the graph are reachable
from a given vertex (for example Wuhan)?
Depth-First Search (DFS)
An extra exercise
A network of numbers.
89 72 14 4
98
59 45 20 0
95
What numbers are reachable from 0?
Do you want to use BFS or DFS?