Unit V Graph Structures
Unit V Graph Structures
Graph ADT – representations of graph – graph traversals – DAG – topological ordering – greedy
algorithms – dynamic programming – shortest paths – minimum spanning trees – introduction to
complexity classes and intractability
Introduction to Graphs
Graph is a non linear data structure; A map is a well-known example of a graph. In a map various
connections are made between the cities. The cities are connected via roads, railway lines and
aerial network. We can assume that the graph is the interconnection of cities by roads. Euler used
graph theory to solve Seven Bridges of Königsberg problem. Is there a possible way to traverse
every bridge exactly once – Euler Tour.
Defining the degree of a vertex to be the number of edges incident to it, Euler showed that there
is a walk starting at any vertex, going through each edge exactly once and terminating at the start
vertex iff the degree of each, vertex is even. A walk which does this is called Eulerian. There is
no Eulerian walk for the Koenigsberg bridge problem as all four vertices are of odd degree.
A graph contains a set of points known as nodes (or vertices) and set of links known as edges
(or Arcs) which connects the vertices.
A graph is defined as Graph is a collection of vertices and arcs which connects vertices in
the graph. A graph G is represented as G = (V, E), where V is set of vertices and E is set of
edges.
Example: graph G can be defined as G = (V , E) Where V = {A,B,C,D,E} and
Graph Terminology
1
1. Vertex : An individual data element of a graph is called as Vertex. Vertex is also known as
node. In above example graph, A, B, C, D & E are known as vertices.
2. Edge : An edge is a connecting link between two vertices. Edge is also known as Arc. An edge
is represented as (starting Vertex, ending Vertex).
In above graph, the link between vertices A and B is
represented as (A,B). Edges are three types:
1. Undirected Edge - An undirected edge is a bidirectional edge. If there is an undirected edge
between vertices A and B then edge (A , B) is equal to edge (B , A).
2. Directed Edge - A directed edge is a unidirectional edge. If there is a directed edge between
vertices A and B then edge (A , B) is not equal to edge (B , A).
2
3. Weighted Edge - A weighted edge is an edge with cost on it.
Types of Graphs
1.Undirected
Graph
A graph with only undirected edges is said to be undirected graph.
2. Directed Graph
3. Complete Graph
A graph in which any V node is adjacent to all other nodes present in the graph is known as a
complete graph. An undirected graph contains the edges that are equal to edges = n(n-1)/2 where
n is the number of vertices present in the graph. The following figure shows a complete graph.
4. Regular Graph
3
Regular graph is the graph in which nodes are adjacent to each other, i.e., each node is
5. Cycle Graph
A graph having cycle is called cycle graph. In this case the first and last nodes are the same. A
closed simple path is a cycle.
4
6. Acyclic Graph
7. Weighted Graph
A graph is said to be weighted if there are some non negative value assigned to each edges
of the graph. The value is equal to the length between two vertices. Weighted graph is also
called a network.
Outgoing Edge
Incoming Edge
Degree
Total number of incoming edges connected to a vertex is said to be indegree of that vertex.
Outdegree
Total number of outgoing edges connected to a vertex is said to be outdegree of that vertex.
If there are two undirected edges to have the same end vertices, and for two directed edges
to have the same origin and the same destination. Such edges are called parallel edges or
multiple edges.
Self-loop
Simple Graph
6
Adjacent nodes
When there is an edge from one node to another then these nodes are called adjacent nodes.
Incidence
In an undirected graph the edge between v1 and v2 is incident on node v1 and v2.
Walk
A walk is defined as a finite alternating sequence of vertices and edges, beginning and ending
with vertices, such that each edge is incident with the vertices preceding and following it.
Closed walk
A walk which is to begin and end at the same vertex is called close walk. Otherwise it is an open
walk.
A open walk in which no vertex appears more than once is called a path.
If e1 and e2 be the two edges between the pair of vertices (v1,v3) and (v1,v2) respectively, then
v3 e1 v1 e2 v2 be its path.
Length of a path
The number of edges in a path is called the length of that path. In the following, the length of the
path is 3.
7
An open walk Graph
Circuit
A closed walk in which no vertex (except the initial and the final vertex) appears more than
once is called a circuit.
A circuit having three vertices and three edges.
8
Sub Graph
A graph S is said to be a sub graph of a graph G if all the vertices and all the edges of S are in G,
and each edge of S has the same end vertices in S as in G. A subgraph of G is a graph G’ such
that V(G’) V(G) and E(G’) E(G)
Connected Graph
A graph G is said to be connected if there is at least one path between every pair of vertices in
G. Otherwise,G is disconnected.
This graph is disconnected because the vertex v1 is not connected with the other vertices of the
graph.
Degree
9
In an undirected graph, the number of edges connected to a node is called the degree of that node
or the degree of a node is the number of edges incident on it.
In the above graph, degree of vertex v1 is 1, degree of vertex v2 is 3, degree of v3 and v4 is
2 in a connected graph.
Indegree
The indegree of a node is the number of edges connecting to that node or in other words edges
incident to it.
In the above graph,the indegree of vertices v1, v3 is 2, indegree of vertices v2, v5 is 1 and indegree
of v4 is zero.
10
Outdegree
The outdegree of a node (or vertex) is the number of edges going outside from that node or in other
words the
ADT of Graph:
Structure Graph is
objects: a nonempty set of vertices and a set of undirected edges, where each edge is a
pair of vertices functions: for all graph Graph, v, v1 and v2 Vertices
Graph Create()::=return an empty graph
Graph InsertEdge(graph, v1,v2)::= return a graph with new edge between v1 and v2
Graph DeleteVertex(graph, v)::= return a graph in which v and all edges incident to it are
removed
1. Adjacency Matrix
2. Adjacency List
3. Adjacency
Multilists 1.Adjacency
Matrix
In this representation, graph can be represented using a matrix of size total number of vertices by
total number of vertices; means if a graph with 4 vertices can be represented using a matrix of
4X4 size.
11
In this matrix, rows and columns both represent vertices.
This matrix is filled with either 1 or 0. Here, 1 represents there is an edge from row vertex to
column vertex and 0 represents there is no edge from row vertex to column vertex.
Adjacency Matrix : let G = (V, E) with n vertices, n 1. The adjacency matrix of G is a 2-
dimensional n n matrix, A, A(i, j) = 1 iff (vi, vj) E(G) (vi, vj for a diagraph), A(i, j) =
0 otherwise.
example : for undirected graph
12
The adjacency matrix for an undirected graph is symmetric; the adjacency matrix for a
digraph need not be symmetric.
Merits of Adjacency Matrix:
13
The space needed to represent a graph using adjacency matrix is n2 bits. To identify the
edges in a graph, adjacency matrices will require at least O(n2) time.
2. Adjacency List
In this representation, every vertex of graph contains list of its adjacent vertices. The n rows
of the adjacency matrix are represented as n chains. The nodes in chain I represent the
vertices that are adjacent to vertex i.
It can be represented in two forms. In one form, array is used to store n vertices and chain is
used to store its adjacencies. Example:
So that we can access the adjacency list for any vertex in O(1) time. Adjlist[i] is a pointer to to
first node in the adjacency list for vertex i. Structure is
#define MAX_VERTICES
50 typedef struct node
*node_pointer; typedef struct
node {
int vertex;
struct node *link;
};
node_pointer
graph[MAX_VERTICES]; int
n=0; /* vertices currently in use */
example: consider the following directed graph representation implemented using linked list
14
This representation can also be implemented using array
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
9 11 13 15 17 18 20 22 23 2 1 3 0 0 3 1 2 5 6 4 5 7 6
Graph
Instead of chains, we can use sequential representation into an integer array with size n+2e+1.
For 0<=i<n, Array[i] gives starting point of the list for vertex I, and array[n] is set to n+2e+1.
The adjacent vertices of node I are stored sequentially from array[i].
For an undirected graph with n vertices and e edges, linked adjacency list requires an array of
size n and 2e chain nodes. For a directed graph, the number of list nodes is only e. the out degree
of any vertex may be determined by counting the number of nodes in its adjacency list. To find
in-degree of vertex v, we have to traverse complete list.
To avoid this, inverse adjacency list is used which contain in-degree.
15
3.Adjacency Multilists
In the adjacency-list representation of an undirected graph each edge (u, v) is represented by two
entries one on the list for u and the other on tht list for v. As we shall see in some situations it is
necessary to be able to determin ie ~ nd enty for a particular edge and mark that edg as having
been examined. This can be accomplished easily if the adjacency lists are actually maintained as
multilists (i.e., lists in which nodes may be shared among several lists). For each edge there will
be exactly one node but this node will be in two lists (i.e. the adjacency lists for each of the two
nodes to which it is incident).
For adjacency multilists, node
structure is typedef struct edge
*edge_pointer; typedef struct edge
{
short int
marked; int
vertex1, vertex2;
edge_pointer path1, path2;
};
edge_pointer graph[MAX_VERTICES];
16
Lists: vertex 0: N0->N1->N2, vertex 1: N0-
>N3->N4 vertex 2: N1->N3->N5,
vertex 3: N2->N4->N5
Figure: Adjacency multilists for given graph
4. Weighted edges
In many applications the edges of a graph have weights assigned to them. These weights may
represent the distance from one vertex to another or the cost of going from one; vertex to an
adjacent vertex In these applications the adjacency matrix entries A [i][j] would keep this
information too. When adjacency lists are used the weight information may be kept in the
list’nodes by including an additional field weight. A graph with weighted edges is called a
network.
Given a graph G = (V E) and a vertex v in V(G) we wish to visit all vertices in G that are
reachable from v (i.e., all vertices that are connected to v). We shall look at two ways of doing
17
this: depth-first search and breadth-first search. Although these methods work on both directed
and undirected graphs the following discussion assumes that the graphs are undirected.
Graph Traversal
Depth-First Search
o Otherwise, backtrack
Time complexity
o Adjacency list: O(|E|)
We begin by visiting the start vertex v. Next an unvisited vertex w adjacent to v is selected, and a
depth-first search from w is initiated. When a vertex u is reached such that all its adjacent vertices
have been visited, we back up to the last vertex visited that has an unvisited vertex w adjacent to
it and initiate a depth-first search from w.
The search terminates when no unvisited vertex can be reached from any of the visited vertices.
DFS traversal of a graph, produces a spanning tree as final result. Spanning Tree is a graph
without any loops. We use Stack data structure with maximum size of total number of
vertices in the graph to implement DFS traversal of a graph.
Step 2: Select any vertex as starting point for traversal. Visit that vertex and push it on to the Stack.
Step 3: Visit any one of the adjacent vertex of the verex which is at top of the stack which is not
visited and push it on to the stack.
Step 4: Repeat step 3 until there are no new vertex to be visit from the vertex on top of the stack.
Step 5: When there is no new vertex to be visit then use back tracking and pop one vertex
from the stack. Step 6: Repeat steps 3, 4 and 5 until stack becomes Empty.
Step 7: When stack becomes Empty, then produce final spanning tree by removing unused
edges from the graph This function is best described recursively as in Program.
#define FALSE 0
#define TRUE 1
18
int
visited[MAX_VERTICES]
; void dfs(int v)
{
node_pointer w;
visited[v]=
TRUE;
printf(“%d”, v);
for (w=graph[v]; w; w=w-
>link) if (!visited[w-
>vertex])
dfs(w->vertex);
}
Consider the graph G of Figure 6.16(a), which is represented by its adjacency lists as in Figure
6.16(b). If a depth- first search is initiated from vertex 0 then the vertices of G are visited in the
following order: 0 1 3 7 4 5 2 6. Since DFS(O) visits all vertices that can be reached from 0
the vertices visited, together with all edges in G incident to these vertices form a connected
component of G.
Figure: Graph and its adjacency list representation, DFS spanning tree
Analysis or DFS:
When G is represented by its adjacency lists, the vertices w adjacent to v can be determined by
following a chain of links. Since DFS examines each node in the adjacency lists at most once and
there are 2e list nodes the time to complete the search is O(e). If G is represented by its
adjacency matrix then the time to determine all vertices adjacent to v is O(n). Since at most n
vertices are visited the total time is O(n2).
19
Complexity of Depth First Search
The time complexity of the DFS algorithm is represented in the form of O(V + E), where V is the
number of nodes and E is the number of edges.
The space complexity of the algorithm is O(V).
Breadth-First Search
In a breadth-first search, we begin by visiting the start vertex v. Next all unvisited vertices
adjacent to v are visited. Unvisited vertices adjacent to these newly visited vertices are then
visited and so on.
Algorithm BFS gives the details as below.
typedef struct queue
*queue_pointer;
typedef struct queue {
int vertex;
queue_pointer link;
};
Void
addq(queue_pointe
r *, queue_pointer
*, int);
int
deleteq(queue_pointer*;
void bfs(int v)
{
node_pointer w;
queue_pointer front,
rear; front = rear =
NULL;
printf(“%d”, v);
visited[v] = TRUE;
addq(&front, &rear,
v);
20
while (front) {
v= deleteq(&front);
for (w=graph[v]; w; w=w-
>link) if (!visited[w-
>vertex]) { printf(“%d”,
w->vertex);
addq(&front, &rear, w-
>vertex); visited[w-
>vertex] = TRUE;
}
}
}
Steps:
BFS traversal of a graph, produces a spanning tree as final result. Spanning Tree is a graph
without any loops. We use Queue data structure with maximum size of total number of vertices
in the graph to implement BFS traversal of a graph.
Analysis of Components:
If G is represented by its adjacency lists, then the total time taken by dfs is O(e). Since the for
loops take O(n) time, the total time to generate all the Connected components is O(n+e). If
adjacency matrices are used,then the time required is O(n2)
A strongly connected component is the portion of a directed graph in which there is a path from
each vertex to another vertex. It is applicable only on a directed graph.
For example:
Let us take the graph below.
Initial graph
The strongly connected components of the above graph are:
22
Strongly connected components
Strongly Connected Components Applications
Vehicle routing applications
Maps
Model-checking in formal verification
Topological Sorting
Topological sorting for Directed Acyclic Graph (DAG) is a linear ordering of vertices such
that for every directed edge uv, vertex u comes before v in the ordering. Topological Sorting for a
graph is not possible if the graph is not a DAG.
For example, a topological sorting of the following graph is “5 4 2 3 1 0”. There can be more than
one topological sorting for a graph.
For example, another topological sorting of the following graph is “4 5 2 3 1 0”. The first
vertex in topological sorting is always a vertex with in-degree as 0 (a vertex with no in-coming
edges).
23
Algorithm for Topological Sorting:
We can modify DFS to find the Topological Sorting of a graph.
In DFS,
We start from a vertex, we first print it, and then
Recursively call DFS for its adjacent vertices.
In topological sorting,
We use a temporary stack.
We don’t print the vertex immediately,
We first recursively call topological sorting for all its adjacent vertices, then push it to a stack.
Finally, print the contents of the stack.
Note: A vertex is pushed to stack only when all of its adjacent vertices (and their adjacent vertices
and so on) are already in the stack
Approach:
Create a stack to store the nodes.
Initialize visited array of size N to keep the record of visited nodes.
Run a loop from 0 till N
if the node is not marked True in visited array
Call the recursive function for topological sort and perform the following steps.
Mark the current node as True in the visited array.
Run a loop on all the nodes which has a directed edge to the current node
if the node is not marked True in the visited array:
Recursively call the topological sort function on the node
Push the current node in the stack.
Print all the elements in the stack.
Shortest Paths
A path from the source vertex to the destination vertex that costs a minimum is the shortest
path or shortest distance. In graph theory, it is possible to have multiple routes from a source to a
destination. Between these routes, if there is a route that costs a minimum amount, we can call it the
shortest path algorithm.
24
Here “Cost” means the number of nodes in the route or the summation of costs on each path. A
path can have one or multiple edges. The connection between two vertices is called “edge”. There are
various types of shortest path algorithms, like Dijkstra’s Algorithm, Bellman-Ford algorithm, etc.
Weighted graph
A weighted graph is a graph that has a numeric (for example, integer) label w(e) associated
with each edge e, called the weight of edge e. For e = (u,v), we let notation w(u,v) = w(e).
Eg: Let’s look at the following weighted Graph:
The term “Weighted” means moving costs from one node to another. For example, moving
from node 1 to node 2, the cost or weight is 1.
The path between node 1 to node 2 is called the edge.
“Undirected” means you can move one node to another and back to the previous node. So, if
we try to find all the routes from node 1 to node 7, then it will be:
Among these four routes, we can see that the first route will cost us 7. So, it is the shortest path in
terms of cost.
Step 2: Choose a starting vertex and assign infinity path values to all other devices.
Step 4:
26
If the path length of the adjacent vertex is lesser than new path length, don't update it
Step 6: After each iteration, we pick the unvisited vertex with the least path length. So we choose 5
before 7
Step 7: Notice how the rightmost vertex has its path length updated twice
27
for each vertex V in G
distance[V] <- infinite
previous[V] <- NULL
If V != S, add V to Priority Queue Q
distance[S] <- 0
while Q IS NOT EMPTY
U <- Extract MIN from Q
for each unvisited neighbour V of U
tempDistance <- distance[U] + edge_weight(U, V)
if tempDistance < distance[V]
distance[V] <- tempDistance
previous[V] <- U
return distance[], previous[]
Dijkstra's Algorithm Complexity
Time Complexity: O(E Log V)
where, E is the number of edges and V is the number of vertices.
Space Complexity: O(V)
Biconnected Components
A bioconnected component of a graph is a connected subgraph that cannot be broken into
disconnected pieces by deleting any single node (and its incident links). An articulation point is a
node of a graph whose removal would cause an increase in the number of connected components.
Biconnected Graph
A graph is a Biconnected graph if:
There is a path from any node to any other node, i.e. the graph must be connected.
After removing any node and all the associated edges from the graph, it still remains connected, i.e. there is
always a path between any two nodes even after removing any node from the graph.
For example consider the graphs below,
28
You can try and remove every possible node from any graph, it always remains connected. And also all of the graphs
are one single component, there is a path from any node to any other node.
Consider the graph shown below, it is not a Biconnected graph because if we remove the node with value 2, it
increases the number of connected components and there is no edge between node 1 and node 5, or node
4 and node 6 etc.
A node whose removal from the graph increases the number of connected components is called an Articulation Point.
In the above graph, the node with value 2 is an articulation point.
A graph can contain multiple articulation points. We will also discuss the algorithm to detect the articulation points in a
graph.
To check whether a graph is Biconnected or not, we mainly need to check two things in a graph,
o The graph is connected, i.e. there must be a path between any two nodes in the graph.
o There is no articulation point in graph, as discussed above that a Biconnected graph has no articulation points.
Biconnected Components
Before Biconnected Components, let's first try to understand what a Biconnected Graph is and how to check if a given graph is
Biconnected or not.
A graph is said to be Biconnected if:
1. It is connected, i.e. it is possible to reach every vertex from every other vertex, by a simple path.
2. Even after removing any vertex the graph remains connected.
For example, consider the graph in the following figure
29
The given graph is clearly connected. Now try removing the vertices one by one and observe. Removing any of the vertices does not
increase the number of connected components. So the given graph is Biconnected.
Now consider the following graph which is a slight modification in the previous graph.
In the above graph if the vertex 2 is removed, then here's how it will look:
30
Clearly the number of connected components have increased. Similarly, if vertex 3 is removed there will be no path left to reach vertex
0 from any of the vertices 1, 2, 4 or 5. And same goes for vertex 4 and 1. Removing vertex 4 will disconnect 1 from all other vertices 0,
2, 3 and 4. So the graph is not Biconnected.
Now what to look for in a graph to check if it's Biconnected. By now it is said that a graph is Biconnected if it has no vertex such that its
removal increases the number of connected components in the graph. And if there exists such a vertex then it is not Biconnected. A
vertex whose removal increases the number of connected components is called an Articulation Point.
So simply check if the given graph has any articulation point or not. If it has no articulation point then it is Biconnected otherwise not.
Here's the pseudo code:
time = 0
function isBiconnected(vertex, adj[][], low[], disc[], parent[], visited[], V)
disc[vertex]=low[vertex]=time+1
time = time + 1
visited[vertex]=true
child = 0
for i = 0 to V
if adj[vertex][i] == true
if visited[i] == false
child = child + 1
parent[i] = vertex
result = isBiconnected(i, adj, low, disc, visited, V, time)
if result == false
return false
low[vertex] = minimum(low[vertex], low[i])
if parent[vertex] == nil AND child > 1
return false
if parent[vertex] != nil AND low[i] >= disc[vertex]
return false
else if parent[vertex] != i
low[vertex] = minimum(disc[i], low[vertex])
return true
31
The code above is exactly same as that for Articulation Point with one difference that it returns false as soon as it finds an Articulation
Point.
The image below shows how the DFS tree will look like for the graph in Fig. 2 according to the algorithm, along with the value of the
arrays low[] and disc[].
Clearly for vertex 4 and its child 1, low[1]≥disc[4], so that means 4 is an articulation point. The algorithm returns false as soon as it
discovers that 4 is an articulation point and will not go on to check low[] for vertices 0, 2 and 3. Value of low[] for all vertices is just
shown for clarification.
Following image shows DFS tree, value of arrays low[] and disc[] for graph in Fig.1
32
Clearly there does not exists any vertex x, such that low[x]≥disc[x], i.e. the graph has no articulation point, so the algorithm returns true,
that means the graph is Biconnected.
Now let's move on to Biconnected Components. For a given graph, a Biconnected Component, is one of its subgraphs which is
Biconnected. For example for the graph given in Fig. 2 following are 4 biconnected components in the graph
33
Biconnected components in a graph can be determined by using the previous algorithm with a slight modification. And that modification
is to maintain a Stack of edges. Keep adding edges to the stack in the order they are visited and when an articulation point is detected
i.e. say a vertex u has a child v such that no vertex in the subtree rooted at v has a back edge (low[v]≥disc[u]) then pop and print all the
edges in the stack till the u−v is found, as all those edges including the edge u−v will form one biconnected component.
time = 0
function DFS(vertex, adj[][], low[], disc[], parent[], visited[], V, stack)
disc[vertex]=low[vertex]=time+1
time = time + 1
visited[vertex]=true
child = 0
for i = 0 to V
if adj[vertex][i] == true
if visited[i] == false
child = child + 1
push edge(u,v) to stack
parent[i] = vertex
DFS(i, adj, low, disc, visited, V, time, stack)
low[vertex] = minimum(low[vertex], low[i])
if parent[vertex] == nil AND child > 1
while last element of stack != (u,v)
print last element of stack
pop from stack
print last element of stack
pop from stack
if parent[vertex] != nil AND low[i] >= disc[vertex]
34
while last element of stack != (u,v)
print last element of stack
pop from stack
print last element of stack
pop from stack
else if parent[vertex] != i AND disc[i] < low[vertex]
low[vertex] = disc[i]
push edge(u,v) to stack
fuction biconnected_components(adj[][], V)
for i = 0 to V
if visited[i] == false
DFS(i, adj, low, disc, parent, visited, V, time, stack)
while stack is not empty
print last element of stack
pop from stack
Then with 3 it finds the edge 3-2 and pushes that in stack
It then discovers the fact that low[1]≥disc[4] i.e. discovers that 4 is an articulation point. So all the edges inserted after the edge 4-1
along with edge 4-1 will form first biconnected component. So it pops and print the edges till last edge is 4-1 and then prints that too and
pop it from the stack
35
Then it discovers the edge 4-5 and pushes that in stack
For 5 it discovers the back edge 5-2 and pushes that in stack
After that no more edge is connected to 5 so it goes back to 4. For 4 also no more edge is connected and also low[5]≱disc[4].
Then it goes back to 2 and discovers that low[4]≥disc[2], that means 2 is an articulation point. That means all the edges inserted after
the edge 4-2 along with the edge 4-2 will form the second biconnected component. So it print and pop all the edges till last edge is 4-2
and then print and pop that too.
Then it goes to 3 and discovers that 3 is an articulation point as low[2]≥disc[3] so it print and pops till last edge is 3-2 and the print and
pop that too. That will form the third biconnected component.
Now finally it discovers that for edge 0-3 also low[3]≥disc[0] so it pops it from the stack and print it as the fourth biconnected component.
Then it checks visited[] value of other vertices and as for all vertices it is true so the algorithm terminates.
So ulitmately the algorithm discovers all the 4 biconnected components shown in Fig.6.
Time complexity of the algorithm is same as that of DFS. If V is the number of vertices and E is the number of edges then complexity
is O(V+E).
36
37
38