Graphalgorithms-Bfs and Dfs
Graphalgorithms-Bfs and Dfs
Graphalgorithms-Bfs and Dfs
Instructor
Kancheepuram, Chennai 600 127, India
N.Sadagopan
An Autonomous Institute under MHRD, Govt of India
Scribe:
http://www.iiitdm.ac.in
Dhanalakshmi S
COM 209T Design and Analysis of Algorithms -Lecture Notes
Graph Algorithms
Objective: In this module, we shall introduce graphs which are a powerful model for modelling combi-
natorial problems in computing. We also discuss graph traversals, namely, Breadth First Search (BFS) and
Depth First Search (DFS) and their applications.
1 Preliminaries
Definition 1 (Graph) A graph G is a pair G = (V (G), E(G)) consisting of a finite set V (G) 6= ∅ and
a set E(G) of two-element subsets of V (G). The elements of V (G) are called vertices of G. An element
e = {a, b} of E(G) is called an edge of G with end vertices a and b.
a
Vertex set = { a,b,c,d }
Edge set = { {a,c}, {b,c}, {c,d}, {b,d}}
c
d
b
Note that, graph is a non-linear data structure. A graph is said to be simple if it has no multiple edges
between any two vertices and no self loops (an edge from a vertex to itself). Throughout this lecture, we
will look at only finite and simple graphs.
e e e
f f f
a a a
c c
c
b d d d
Figure 2: A graph G, a sub graph of G on the vertex set {a, c, d, e, f } and an induced graph on the vertex
set {a, c, d, e, f }
a
x y
a z
b d
Definition 3 (Connected Graph) Two vertices u and v of a graph G are said to be connected if there
exists a path from u to v in G. A graph G is said to be connected if every pair of its vertices are connected.
b d
Figure 4: An example for a disconnected graph. The graph has two connected components, one is a graph
induced on the vertex set {a, b, c} and the other is {d}
Definition 6 (Acyclic Graph) A graph that contains no cycles is called acyclic graph.
Definition 7 (Tree) A connected acyclic graph is called a tree. It is a spanning tree of a graph G if it
spans G (that is, it includes every vertex of G) and is a sub-graph of G (every edge in the tree belongs to G).
• In a tree, we can find only one path for every pair of its vertices.
• Tree is a non-linear data structure.
Definition 8 (Bipartite Graph) A graph G is called a bigraph or bipartite graph if V can be parti-
tioned into two disjoint subsets V1 and V2 such that every line of G joins a point of V1 to a point of V2 .
(V1 , V2 ) is called a bipartition of G.
2
a
a
b e e
b
g g
c
c f
d f
d
Graph G Tree T
6
2 3 2 3
7 4
4
9
12 11 8 12 11 8
10 9 5 6 7
10
13 14 14
13
15 16 15 16
Time complexity: O(n) effort is spent in initializing the boolean array to keep track of which node is
visited/unvisited. As part of BFS procedure, each edge is visited at most once, therefore, the total time is
O(n + m), where n → vertices and m → edges of a graph G.
E(G) denotes the edges in a graph G, E(T ) denotes the edges in the BFS tree T of G. The set
En = E(G)\E(T ) denotes the set of non-tree edges i.e., the edges which are in the graph G but not in
the tree T .
3
• Cross edge if u ∈ Li and v ∈ Li (i.e., both the vertices are in same level). Let Ec denotes the set of all
cross edges in T .
• Slanting edge if u ∈ Li and v ∈ Lj , j = i + 1 or j = i − 1 (i.e., both the vertices are in adjacent levels).
Let Es denotes the set of all cross edges in T .
4
• Test for existence of C4 :
Problem: Given a graph G, find whether G contains a C4 or not?
Solution using BFS: Run BFS(G) and let En denote the set of non-tree edges. For each e =
{u, v} ∈ En , construct two sets A = NG0 (u)\NG (v) and B = NG0 (v)\NG (u), where G0 = G\e. Now,
for every element x in A, check whether it has a neighbor y in B or not. If it has a neighbor then the
set {u, x, y, v} form a C4 . Time complexity: Number of non-tree edges is O(m), number of elements
in A is O(n) and in B is O(n). Thus, this approach takes O(mn2 ) time.
– Mark the non-tree (missing) edges using dotted lines in the BFS tree T.
– Decompose En into Ec and Es .
– We know that, a graph is bipartite if and only if it is odd cycle free. By using the fact in test for
odd cycles: we can conclude that, if Ec = ∅, then G is bipartite.
• Test for 2-colorability:
Problem: Given a graph G, check whether can we color the vertices of a graph G using two colors
such that no two adjacent vertices have the same color.
Solution using BFS: Testing whether a graph is 2-colorable or not is equivalent to testing whether
a graph is bipartite or not.
• Shortest path computation:
Problem: Given a graph G and two vertices u and v, find the shortest path between u and v.
Solution using BFS: Run BFS(G) by having starting vertex as u. Since, T=BFS(G) is a tree, there
exist only one path from u to v and that path is the shortest path from u to v. This takes O(n + m)
time.
• All pairs Shortest path problem:
Problem: Given a graph G, find the shortest path between all pairs of vertices in G.
Solution using BFS: For every vertex v ∈ V (G), Run BFS(G) by having starting vertex as v. For
each BFS tree T , print the path from v to x for all x in T . This takes O(n(n + m)) time.
Remarks:
5
• If |Ec | =
6 ∅ then G is not bipartite.
• If |Ec | = ∅ and |Es | = ∅ then the given graph G is a tree (bipartite).
• The example given below has an even cycle, however, Es = ∅.
1
3 2
2 3 4 5
5
4
6
Graph G DFS Tree T
1
1
8
9
8
2 10
3
3
11
5 6
7 4
7
11
2 9 10
Time complexity: Since each edge is visited at most twice: one during DFS call and the other visit is
during back tracking, the effort is O(n + m), where n represents the number of vertices and m represents
the number of edges of a graph G.
Note: Here the non-tree edges are called as back edges.
7
• Determine the number of connected components:
Problem: Given a graph G, find the number of connected components in G.
Solution using DFS: Run DFS until V (G) = V (T ). The number of calls to DFS determines the
number of connected components in G.
13 12
11
9 7
5
10 6
8
2 3
Figure 8: Articulation Points: vertex 1, vertex 4, vertex 7, vertex 9 and vertex 11; Bridges: {1,4}, {7,8},
{9,10}, {11,12} and {11,13}.
Note:
- If a network does not contain a bridge and an articulation point then it is considered a good network
as it can tolerate single node or single edge failure.
- If G is 2-connected then it can tolerate single node failures but not 2-node failures
- It is not necessary that existence of articulation point implies the existence of bridges.
- In any bridge, atleast one end-point is an articulation point.
- The upper bound for the number of articulation points in a connected graph G with n vertices is n − 2.
- The upper bound for the number of bridges in a connected graph G with n vertices is n − 1.
8
Approach 2: The vertex w in a DFS tree T is said to be an articulation point if there is no back edge
from the descendant vertices of w to the ancestor vertices of w. The root node of T is an articulation
point if degree of the root node in DFS tree is greater than or equal to two. This can be done using
the following algorithm:
Remark: What does for a vertex v, for all child wi , L(wi ) ≥ df n(v) mean ? It means that, there
does not exist a back edge from any descendant of v to any ancestor of v. Moreover, to check whether
a vertex v is an articulation point or not, it is enough to check whether there exist a child for v whose
descendants do not have a back edge to any ancestor of v.
Time Complexity to list all APs: O(n + m) [ O(n + m) for DFS Tree + O(n) for post order
traversal + O(n) for checking whether it is A.P ]
Example 1:
9
12 8
10
11
L(8) = min{6, ∞, ∞} = 6
L(12) = min{10, ∞, ∞} = 10
L(10) = min{8, 5, ∞} = 5
9
L(9) = min{7, 5, ∞} = 5
L(6) = min{4, 1, ∞} = 1
L(5) = min{3, 1, 1} = 1
L(4) = min{2, 1, ∞} = 1
L(3) = min{12, ∞, ∞} = 12
L(2) = min{11, ∞, ∞} = 11
Query 1: v is an articulation point if there exist at least one child of v, say w, such that L(v) ≤ L(w)
?
Counter example: In the example, given above, for the vertex 10, there exist a child 11 such that
L(11) = L(10) = 5, but 10 is not an articulation point.
Query 2: v is an articulation point if there exist at least one child of v, say w, such that df n(v) ≥ L(w)
?
Counter example: In the example, given above, for a vertex 9, there exist a child 10 such that
df n(9) = 7 > L(10) = 5, but 9 is not an articulation point.
Approach 2: The edge {u, v} in a DFS tree is said to be a bridge if there is no back edge from the
descendant vertices of v to u or to the ancestor vertices of u. This can be done using the following
algorithm: By Algorithm 4, for the example given, the bridges are {1,2}, {1,3}, {11,12}, {7,8}.
Query 1: the edge {u, v} is a bridge if (i) {u, v} is a tree edge (ii) df n(u) < df n(v) and (iii)
L(v) = df n(v) ?
10
Since the above condition respects the definition of bridge, and in particular, condition (iii) ensures
there is no back edge from any descendant of v to u or the ancestor of u, the above check indeed works.
Query 2: the edge {u, v} is a bridge if there exist a child w for v such that L(w) ≥ L(v) ?
Counter example: Consider the Example 1: Take an edge {10, 11}, there exist a child 12 for 11 such
that L(12) = 10 > L(11) = 5, but {10, 11} is not a bridge.
2 5 6 2 3
3
9
4
7 4
5 6
8
3
9
7 3
Problem: Given a directed graph G, find all of its strongly connected components.
11
Algorithm 6 Strongly Connected Components (SCC)
Input: Directed Graph G
Output: Strongly Connected Components of G.
Step 1: Run DFS(G) and compute the finish time for all vertices.
Step 2: Find GT and sort the vertex set of G in decreasing order based on its finish time.
Step 3: Run DF S(GT ) from the vertex which has maximum finish time. Do this step repeatedly until
all the vertices in GT are visited at least once (this gives you the collection of SCC).
Step 4: Each forest in DF S(GT ) is a SCC.
4 7 8 1 (1,20)
1
4 (6,19)
(2,5) 2
2 6 9
5
7 (11,18)
(3,4) 3 5
3 (7,10) 8
10 (12,17)
Graph G
(8,9) 6
9 (13,16)
DFS(G)
The second co-ordinate of the
red color ordered pair denotes
the finish time of the corresponding 10 (14,15)
vertex and the first co-ordinate
denotes the dfs number
4 7 8
1
2 6 9
5
3
10
Graph G^T
Step 1: Run DFS from the vertex 1, which has the high finish time. 1
No further vertex to visit. Thus, {1} is a strongly connected component
4
Step 2: Run DFS from the vertex 4, which has the next high finish time.
10
The graph induced on the vertex set {7,8,9,10}
forms a strongly connected component
9
8
Step 4: Run DFS from the vertex 2, which has the next high finish time. 2
No further vertex to visit. Thus, {2} is a strongly connected component
12
Step 5: Run DFS from the vertex 3, which has the next high finish time. 3
No further vertex to visit. Thus, {3} is a strongly connected component
1.4 Topological Sorting
Problem: How long does it take to complete a B.Tech programme ?
Description: Every B.Tech student in IIITD&M has to complete the set of 55 courses in the curriculum to
get their degree certificate. If you are given a chance to do as many courses as possible in a semester with
a constraint: some courses has a pre-requisite course, which has to be completed in the previous semesters,
what is the minimum number of semesters to complete all 55 courses ? (The question maximum is invalid
because one can do one course in a semester to reach the maximum number)
Strategy 1: Construct a pre-requisite graph on 55 courses. i.e., construct a graph with 55 vertices (each
vertex corresponds to a course) and an directed edge (u, v) ∈ E(G) if the course u is the pre-requisite for
the course v. Now, remove the vertices of in-degree zero and add the corresponding courses in semester 1.
Repeat this process until you have visited all the vertices in the graph G, to get the minimum number of
semesters. An example is illustrated below:
2 3 2 3
4
4 8 4 8
8
5
5 Vertices with in-degree zero = {1} Vertices with in-degree zero = {2,3}
5
Semester 1 = {1} Semester 2 = {2,3}
6
6 6
7
5
6
Vertices with in-degree zero = {5} Vertices with in-degree zero = {6} 7
Semester 4 = {5} Semester 5 = {6}
6
7 Vertices with in-degree zero = {7}
Semester 6 = {7}
7
Strategy 2: Run DFS and compute the height for each forest in the DFS. Pick the maximum height. This
strategy fails because of the following counter example:
13
1 1
3 3 2
2
6 4
4 8 8
5
5 7
6 DFS(G)
A pre-requisite graph G
for eight courses
Figure 9: Course 6 has to be done before the course 5, which is a contradiction as course 5 is a pre-requisite
course for course 6
Acknowledgements: Lecture contents presented in this module and subsequent modules are based
on the following text books and most importantly, author has greatly learnt from lectures by algorithm
exponents affiliated to IIT Madras/IMSc; Prof C. Pandu Rangan, Prof N.S.Narayanaswamy, Prof Venkatesh
Raman, and Prof Anurag Mittal. Author sincerely acknowledges all of them. Special thanks to Teaching As-
sistants Mr.Renjith.P and Ms.Dhanalakshmi.S for their sincere and dedicated effort and making this scribe
possible. Author has benefited a lot by teaching this course to senior undergraduate students and junior
undergraduate students who have also contributed to this scribe in many ways. Author sincerely thank all
of them.
References:
1. E.Horowitz, S.Sahni, S.Rajasekaran, Fundamentals of Computer Algorithms, Galgotia Publications.
2. T.H. Cormen, C.E. Leiserson, R.L.Rivest, C.Stein, Introduction to Algorithms, PHI.
3. Sara Baase, A.V.Gelder, Computer Algorithms, Pearson.
14