Graph and Implementation of Graph
Graph and Implementation of Graph
Graph and Implementation of Graph
1. Graph
A Graph is a non-linear data structure consisting of a finite set of nodes and edges. The nodes are
referred to as vertices and the edges are lines or arcs that connect any two nodes in the graph. More
formally a Graph can be defined as,
A Graph is a finite set (V, E) of vertices (or nodes) and set of Edges (or arcs) which connect a pair
of nodes.
The above figure shows a graph, where the set of vertices, V = {0, 1, 2, 3, 4} and the set of edges
E = {(0, 1), (1, 2), (2, 3), (3, 4), (0, 4), (1, 4), (1, 3)}.
In the graph, V = {0, 1, 2, 3}, E = {(0, 1), (0, 2), (0, 3), (1, 2)} and graph, G = {V, E}.
Graphs are used to solve real-life problems that involve representation of the problem space as a
network. Examples of networks include city network, airline networks, telephone networks, circuit
networks, social networks (like LinkedIn, Facebook etc.). For example, a single user in Facebook
can be represented as a node (vertex) while their connection with others can be represented as an
edge between nodes.
On facebook, everything is a node. That includes User, Photo, Album, Event, Group, Page,
Comment, Story, Video, Link, Note...anything that has data is a node. Every relationship is an
1
edge from one node to another. Whether you post a photo, join a group, like a page etc., a new
edge is created for that relationship.
All of facebook is then, a collection of these nodes and edges. This is because facebook uses a
graph data structure to store its data.
2. Graph Terminologies
Vertex − each node of the graph is represented as a vertex. In the following example, the labeled
circle represents vertices. Thus, A to G are vertices.
Edge − Edge represents a path between two vertices or a line between two vertices. In the
following example, the lines from A to B, B to C, and so on represents edges.
2
Adjacency: A vertex is said to be adjacent to another vertex if there is an edge connecting them.
Vertices C and D are adjacent because there is an edge between them. Vertices B and E are not
adjacent because there is no edge between them.
Path − Path represents a sequence of edges between the two vertices. In the following example,
ABCD represents a path from A to D.
Outgoing edges – outgoing edges of a vertex are directed edges that the vertex is the origin.
Incoming edges - Incoming edges of a vertex are directed edges that the vertex is the destination.
Parallel edges or multiple edges are edges of the same type and end-vertices.
A directed graph is called weekly connected if replacing all of its directed edges with undirected
edges produces a connected (undirected) graph. The vertices in a weakly connected graph have
their out-degree or in-degree of at least 1.
Tree is a connected graph with no cycles. If we remove all the cycles from directed acyclic graph
(DAG) it becomes tree and if we remove any edge in a tree it becomes forest.
Spanning tree of an undirected graph is a subgraph that is a tree which includes all of the vertices
of the graph
3
The set of all neighbors of a vertex v of G = (V,E), denoted by N(v), is called the neighborhood of
v. If A is a subset of V, we denote by N(A) the set of all vertices in G that are adjacent to at least
one vertex in A. So, N(A) =Սv∈A N(v).
The degree of a vertex in an undirected graph is the number of edges incident with it, except that
a loop at a vertex contributes twice to the degree of that vertex. The degree of the vertex v is
denoted by deg(v).
In a graph with directed edges the in-degree of a vertex v, denoted by deg−(v),is the number of
edges with v as their terminal vertex. The out-degree of v, denoted by deg+(v), is the number of
edges with v as their initial vertex. (Note that a loop at a vertex contributes 1 to both the in-degree
and the out-degree of this vertex.)
Weighted graph: A positive value assigned to each edge indicating its length (distance between
the vertices connected by an edge) is called weight. The graph containing weighted edges is called
a weighted graph. The weight of an edge e is denoted by w(e) and it indicates the cost of traversing
an edge.
3. Graph Representation
Graphs are commonly represented in two ways:
Adjacency Matrix
Adjacency List
Adjacency Matrix
An adjacency matrix is 2D array of V x V vertices. Each row and column represent a vertex. If the
value of any element a[i][j] is 1, it represents that there is an edge connecting vertex i and vertex
j. For example, the matrix shows the adjacency matrix for the given graph:
4
Since it is an undirected graph, for edge (0,2), we also need to mark edge (2,0); making the
adjacency matrix symmetric about the diagonal.
The above graph represents undirected graph with the adjacency matrix representation. It shows
adjacency matrix of undirected graph is symmetric. If there is an edge (2, 4), there is also an edge
(4, 2).
Adjacency matrix of a directed graph is never symmetric adj[i][j] = 1, indicated a directed edge
from vertex i to vertex j.
5
The above graph represents directed graph with the adjacency matrix representation. It shows
adjacency matrix of directed graph which is never symmetric. If there is an edge (2, 4), there is not
an edge (4, 2). It indicates direct edge from vertex i to vertex j.
Adjacency matrix is a way to represent a graph. It shows which nodes are adjacent to one another.
Graph is represented using a square matrix. A graph can be a sparse graph if it contains less number
of edges. It is called a dense graph if it contains more number of edges as compared to sparse graph.
Adjacency matrix is best for dense graph. Adjacency matrix of an undirected graph is always a
symmetric matrix which means an edge (i, j) implies the edge (j, i).
Edge lookup (checking if an edge exists between vertex A and vertex B) is extremely fast in
adjacency matrix representation but we have to reserve space for every possible link between all
vertices (V x V), so it requires more space.
6
Adjacency List
An adjacency list represents a graph as an array of linked list. The index of the array represents a
vertex and each element in its linked list represents the other vertices that form an edge with the
vertex. The adjacency list for the graph is as follows:
For example, consider the following directed graph representation implemented using linked list...
7
An adjacency list is efficient in terms of storage because we only need to store the values for the
edges. For a graph with millions of vertices, this can mean a lot of saved space.
In summary,
4. Graph applications
Since they are powerful abstractions, graphs can be very important in modeling data. In fact, many
problems can be reduced to known graph problems. Here we outline just some of the many
applications of graphs.
1. Social network graphs: to tweet or not to tweet. Graphs that represent who knows whom,
who communicates with whom, who influences whom or other relationships in social
structures. An example is the twitter graph of who follows whom. These can be used to
determine how information flows, how topics become hot, how communities develop, or
even who might be a good match for who, or is that whom.
2. Transportation networks. In road networks vertices are intersections and edges are the road
segments between them, and for public transportation networks vertices are stops and edges
are the links between them. Such networks are used by many map programs such as Google
maps, Bing maps and now Apple IOS 6 maps (well perhaps without the public transport)
to find the best routes between locations. They are also used for studying traffic patterns,
traffic light timings, and many aspects of transportation.
8
3. Utility graphs. The power grid, the Internet, and the water network are all examples of
graphs where vertices represent connection points, and edges the wires or pipes between
them. Analyzing properties of these graphs is very important in understanding the
reliability of such utilities under failure or attack, or in minimizing the costs to build
infrastructure that matches required demands.
4. Document link graphs. The best known example is the link graph of the web, where each
web page is a vertex, and each hyperlink a directed edge. Link graphs are used, for example,
to analyze relevance of web pages, the best sources of information, and good link sites.
5. Protein-protein interactions graphs. Vertices represent proteins and edges represent
interactionsbetweenthemthatcarryoutsomebiologicalfunctioninthecell. These graphs can
be used, for example, to study molecular pathways—chains of molecular interactions in a
cellular process. Humanshaveover120Kproteinswithmillionsofinteractionsamong them.
6. Network packet traffic graphs. Vertices are IP (Internet protocol) addresses and edges are
the packets that flow between them. Such graphs are used for analyzing network security,
studying the spread of worms, and tracking criminal or non-criminal activity.
7. Scene graphs. In graphics and computer games scene graphs represent the logical or special
relationships between objects in a scene. Such graphs are very important in the computer
games industry.
8. Finite element meshes. In engineering many simulations of physical systems, such as the
flow of air over a car or airplane wing, the spread of earthquakes through the ground, or
the structural vibrations of a building, involve partitioning space into discrete elements.
The elements along with the connections between adjacent elements forms a graph that is
called a finite element mesh.
9. Robot planning. Vertices represent states the robot can be in and the edges the possible
transitions between the states. This requires approximating continuous motion as a
sequence of discrete steps. Such graph plans are used, for example, in planning paths for
autonomous vehicles.
10. Neural networks. Vertices represent neurons and edges the synapses between them. Neural
networks are used to understand how our brain works and how connections change when
we learn. The human brain has about 1011 neurons and close to 1015 synapses.
11. Graphs in quantum field theory. Vertices represent states of a quantum system and the
edges the transitions between them. The graphs can be used to analyze path integrals and
summing these up generates a quantum amplitude (yes, I have no idea what that means).
12. Semantic networks. Vertices represent words or concepts and edges represent the
relationships among the words or concepts. These have been used in various models of
how humans organize their knowledge, and how machines might simulate such an
organization.
13. Graphs in epidemiology. Vertices represent individuals and directed edges the transfer of
an infectious disease from one individual to another. Analyzing such graphs has become
an important component in understanding and controlling the spread of diseases.
9
14. Graphs in compilers. Graphs are used extensively in compilers. They can be used for type
inference, for so called data flow analysis, register allocation and many other purposes.
They are also used in specialized compilers, such as query optimization in database
languages.
15. Constraint graphs. Graphs are often used to represent constraints among items. For
example the GSM network for cell phones consists of a collection of overlapping cells.
Any pair of cells that overlap must operate at different frequencies. These constraints can
be modeled as a graph where the cells are vertices and edges are placed between cells that
overlap.
16. Dependence graphs. Graphs can be used to represent dependences or precedences among
items. Such graphs are often used in large projects in laying out what components rely on
other components and used to minimize the total time or cost to completion while abiding
by the dependences.
17. In Computer science graphs are used to represent the flow of computation.
18. Google maps uses graphs for building transportation systems, where intersection of
two(or more) roads are considered to be a vertex and the road connecting two vertices is
considered to be an edge, thus their navigation system is based on the algorithm to
calculate the shortest path between two vertices.
19. In Facebook, users are considered to be the vertices and if they are friends then there is
an edge running between them. Facebook’s Friend suggestion algorithm uses graph
theory. Facebook is an example of undirected graph.
20. In World Wide Web, web pages are considered to be the vertices. There is an edge from
a page u to other page v if there is a link of page v on page u. This is an example of
Directed graph. It was the basic idea behind Google Page Ranking Algorithm.
21. In Operating System, we come across the Resource Allocation Graph where each
process and resources are considered to be vertices. Edges are drawn from resources to
the allocated process, or from requesting process to the requested resource. If this leads to
any formation of a cycle then a deadlock will occur.
22. Computer Science: In computer science, graph is used to represent networks of
communication, data organization, computational devices etc.
23. Physics and Chemistry: Graph theory is also used to study molecules in chemistry and
physics.
24. Social Science: Graph theory is also widely used in sociology.
25. Mathematics: In this, graphs are useful in geometry and certain parts of topology such as
knot theory.
26. Biology: Graph theory is useful in biology and conservation efforts.
5. Graphs as an ADT
Description
A graph data structure consists of a finite (and possibly mutable) set of vertices (also called nodes
or points), together with a set of unordered pairs of these vertices for an undirected graph or a set
10
of ordered pairs for a directed graph. These pairs are known as edges (also called links or lines),
and for a directed graph are also known as arrows.
Operations
#include <iostream>
#include <vector>
using namespace std;
// Graph Constructor
Graph(vector<Edge> const &edges, int N)
{
// resize the vector to N elements of type vector<int>
adjList.resize(N);
11
for (auto &edge: edges)
{
// insert at the end
adjList[edge.src].push_back(edge.dest);
// construct graph
Graph graph(edges, N);
12
return 0;
}
#include <iostream>
using namespace std;
// stores adjacency list items
struct adjNode {
int val, cost;
adjNode* next;
};
// structure to store edges
struct graphEdge {
int start_ver, end_ver, weight;
};
class DiaGraph{
// insert new nodes into adjacency list from given graph
adjNode* getAdjListNode(int value, int weight, adjNode* head) {
adjNode* newNode = new adjNode;
newNode->val = value;
newNode->cost = weight;
13
// Destructor
~DiaGraph() {
for (int i = 0; i < N; i++)
delete[] head[i];
delete[] head;
}
};
// print all adjacent vertices of given vertex
void display_AdjList(adjNode* ptr, int i)
{
while (ptr != nullptr) {
cout << "(" << i << ", " << ptr->val
<< ", " << ptr->cost << ") ";
ptr = ptr->next;
}
cout << endl;
}
int main()
{
// graph edges array.
graphEdge edges[] = {
// (x, y, w) -> edge from x to y with weight w
{0,1,2},{0,2,4},{1,4,3},{2,3,2},{3,1,4},{4,3,3}
};
int N = 6; // Number of vertices in the graph
// calculate number of edges
int n = sizeof(edges)/sizeof(edges[0]);
// construct graph
DiaGraph diagraph(edges, n, N);
// print adjacency list representation of graph
cout<<"Graph adjacency list "<<endl<<"(start_vertex, end_vertex,
weight):"<<endl;
for (int i = 0; i < N; i++)
{
// display adjacent vertices of vertex i
display_AdjList(diagraph.head[i], i);
}
return 0;
}
6. Graphs types
Undirected Graph
14
Directed Graph
Set of Edges W = {(1, 3), (1, 5), (2, 1), (2, 3), (2, 4), (3, 4), (4, 5)}
15
Finite Graphs
A graph is said to be finite if it has finite number of vertices and finite number of edges.
Infinite Graph
A graph is said to be infinite if it has infinite number of vertices as well as infinite number of
edges.
Trivial Graph
A graph is said to be trivial if a finite graph contains only one vertex and no edge.
Simple Graph
A simple graph is a graph which does not contains more than one edge between the pair of vertices.
A simple railway tracks connecting different cities is an example of simple graph.
16
Multi Graph
Any graph which contain some parallel edges but doesn’t contain any self-loop is called multi
graph. For example A Road Map.
Parallel Edges: If two vertices are connected with more than one edge than such
edges are called parallel edges that is many roots but one destination.
Loop: An edge of a graph which join a vertex to itself is called loop or a self-loop.
Null Graph
A graph of order n and size zero that is a graph which contain n number of vertices but do not
contain any edge.
17
Complete Graph
A simple graph with n vertices is called a complete graph if the degree of each vertex is n-1, that
is, one vertex is attach with n-1 edges. A complete graph is also called Full Graph.
Pseudo Graph
A graph G with a self-loop and some multiple edges is called pseudo graph.
Regular Graph
A simple graph is said to be regular if all vertices of a graph G are of equal degree. All complete
graphs are regular but vice versa is not possible.
18
Bipartite Graph
A graph G = (V, E) is said to be bipartite graph if its vertex set V(G) can be partitioned into two
non-empty disjoint subsets. V1(G) and V2(G) in such a way that each edge e of E(G) has its one
end in V1(G) and other end in V2(G).
V2(G)={V1, V2}
Labelled Graph
If the vertices and edges of a graph are labelled with name, data or weight then it is called labelled
graph. It is also called Weighted Graph.
19
Digraph Graph
A graph G = (V, E) with a mapping f such that every edge maps onto some ordered pair of vertices
(Vi, Vj) is called Digraph. It is also called Directed Graph. Ordered pair (Vi, Vj) means an edge
between Vi and Vj with an arrow directed from Vi to Vj.
e1 = (V1, V2)
e2 = (V2, V3)
e4 = (V2, V4)
Subgraph
A graph G = (V1, E1) is called subgraph of a graph G(V, E) if V1(G) is a subset of V(G) and
E1(G) is a subset of E(G) such that each edge of G1 has same end vertices as in G.
20
Types of Subgraph:
Vertex disjoint subgraph: Any two graph G1 = (V1, E1) and G2 = (V2, E2) are said to
be vertex disjoint of a graph G = (V, E) if V1(G1) intersection V2(G2) = null. In figure
there is no common vertex between G1 and G2.
Edge disjoint subgraph: A subgraph is said to be edge disjoint if E1(G1) intersection
E2(G2) = null. In figure there is no common edge between G1 and G2.
Note: Edge disjoint subgraph may have vertices in common but vertex disjoint graph cannot have
common edge, so vertex disjoint subgraph will always be an edge disjoint subgraph.
A graph G is said to be connected if for any pair of vertices (Vi, Vj) of a graph G are reachable
from one another. Or a graph is said to be connected if there exist at least one path between each
and every pair of vertices in graph G, otherwise it is disconnected. A null graph with n vertices is
disconnected graph consisting of n components. Each component consist of one vertex and no
edge.
21
Cyclic Graph
A graph G consisting of n vertices and n> = 3 that is V1, V2, V3- – – – – – – – Vn and edges (V1,
V2), (V2, V3), (V3, V4)- – – – – – – – — -(Vn, V1) are called cyclic graph.
Acyclic Graph
An acyclic graph is a graph without cycles (a cycle is a complete circuit). When following the
graph from node to node, you will never visit the same node twice.
This graph (the thick black line) is acyclic, as it has no cycles (complete circuits). A connected
acyclic graph, like the one above, is called a tree. If one or more of the tree “branches” is
disconnected, the acyclic graph is a called a forest.
22
Directed Acyclic Graph
A directed acyclic graph is an acyclic graph that has a direction as well as a lack of cycles.
A directed acyclic graph has a topological ordering. This means that the nodes are ordered so that
the starting node has a lower value than the ending node. A DAG has a unique topological ordering
if it has a directed path containing all the nodes; in this case the ordering is the same as the order
in which the nodes appear in the path.
In computer science, DAGs are also called wait-for-graphs. When a DAG is used to detect a
deadlock, it illustrates that a resources has to wait for another process to continue. Directed acyclic
graphs (DAGs) are used to model probabilities, connectivity, and causality. A “graph” in this sense
means a structure made from nodes and edges.
Transitive closure
The transitive closure of a directed graph with n vertices can be defined as the n-by-n boolean
matrix T, in which the element in the ith row and jth column is 1 if there exist a directed path from
the ith vertex to the jth vertex, otherwise it is zero.
23
Fig: (a) Digraph G (b) Adjacency matrix of G (c) Transitive closure
Warshall's Algorithm
Warshall's algorithm uses the adjacency matrix to find the transitive closure of a directed graph.
The definition of the element at the ith row and jth column in the kth matrix (R(k)), rij(k) is one if
and only if there exist a path from vi to vj such that all the intermediate vertex, wq is among the
first k vertices, ie. 1 ≤ q ≤ k.
The R(0) matrix represent paths without any intermediate vertices, so it is the adjacency matrix.
24
The R(n) matrix has ones if there is a path between the vertices with intermediate vertices from any
of the n vertices of the graph, so it is the transitive closure.
Consider the case rij(k) is one and rij(k-1) = 0. This can occur only if that there is an intermediate path
through vk from from vi to vj. More specifically the list of vertices has the form
In sumarry
That is-
The worst case cost is Θ(n3), so it is not better than the brute force algorithm.
25
Figure: Rule for changing zeros to ones in Warshall’s algorithm
26
8. Graphs traversal
Many graph applications need to visit the vertices of a graph in some specific order based on the
graph's topology. This is known as a graph traversal and is similar in concept to a tree traversal.
Recall that tree traversals visit every node exactly once, in some specified order such as preorder,
inorder, or postorder. Multiple tree traversals exist because various applications require the nodes
to be visited in a particular order. For example, to print a BST's nodes in ascending order requires
an inorder traversal as opposed to some other traversal. Standard graph traversal orders also exist.
Each is appropriate for solving certain problems. For example, many problems in artificial
intelligence programming are modeled using graphs. The problem domain might consist of a large
collection of states, with connections between various pairs of states. Solving this sort of problem
requires getting from a specified start state to a specified goal state by moving between states only
through the connections. Typically, the start and goal states are not directly connected. To solve
this problem, the vertices of the graph must be searched in some organized manner.
Graph traversal algorithms typically begin with a start vertex and attempt to visit the remaining
vertices from there. Graph traversals must deal with a number of troublesome cases. First, it might
27
not be possible to reach all vertices from the start vertex. This occurs when the graph is not
connected. Second, the graph might contain cycles, and we must make sure that cycles do not
cause the algorithm to go into an infinite loop.
Graph traversal algorithms can solve both of these problems by flagging vertices as VISITED
when appropriate. At the beginning of the algorithm, no vertex is flagged as VISITED. The flag
for a vertex is set when the vertex is first visited during the traversal. If a flagged vertex is
encountered during traversal, it is not visited a second time. This keeps the program from going
into an infinite loop when it encounters a cycle.
Once the traversal algorithm completes, we can check to see if all vertices have been processed by
checking whether they have the VISITED flag set. If not all vertices are flagged, we can continue
the traversal from another unvisited vertex. Note that this process works regardless of whether the
graph is directed or undirected.
Using graph traversal we visit all the vertices of the graph without getting into looping path. There
are two graph traversal techniques and they are as follows...
BFS traversal of a graph produces a spanning tree as final result. Spanning Tree is a graph
without loops. We use Queue data structure with maximum size of total number of vertices in
the graph to implement BFS traversal.
BFS examines all vertices connected to the start vertex before visiting vertices further away. BFS
is implemented similarly to DFS, except that a queue replaces the recursion stack. Note that if the
graph is a tree and the start vertex is at the root, BFS is equivalent to visiting vertices level by level
from top to bottom.
28
Step 5 - Repeat steps 3 and 4 until queue becomes empty.
Step 6 - When queue becomes empty, then produce final spanning tree by removing unused
edges from the graph
Example:
29
30
Program to print BFS traversal from a given
31
{
int V; // No. of vertices
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::BFS(int s)
{
// Mark all the vertices as not visited
bool *visited = new bool[V];
for(int i = 0; i < V; i++)
visited[i] = false;
while(!queue.empty())
{
32
// Dequeue a vertex from queue and print it
s = queue.front();
cout << s << " ";
queue.pop_front();
int main()
{
// Create a graph given in the above diagram
Graph g(4);
g.addEdge(0, 1);
g.addEdge(0, 2);
g.addEdge(1, 2);
g.addEdge(2, 0);
g.addEdge(2, 3);
g.addEdge(3, 3);
DFS traversal of a graph produces a spanning tree as final result. Spanning Tree is a graph
without loops.
We use Stack data structure with maximum size of total number of vertices in the graph to
implement DFS traversal.
33
Whenever a vertex v is visited during the search, DFS will recursively visit all of v 's unvisited
neighbors.
Equivalently, DFS will add all edges leading out of v to a stack. The next vertex to be visited is
determined by popping the stack and following that edge.
The effect is to follow one branch through the graph to its conclusion, then it will back up and
follow another branch, and so on. The DFS process can be used to define a depth-first search tree.
This tree is composed of the edges that were followed to any new (unvisited) vertex during the
traversal, and leaves out the edges that lead to already visited vertices. DFS can be applied to
directed or undirected graphs.
Example
34
35
36
37
38
C++ program to print DFS traversal from
39
// Graph class represents a directed graph
// using adjacency list representation
class Graph
{
int V; // No. of vertices
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
40
DFSUtil(*i, visited);
}
int main()
{
// Create a graph given in the above diagram
Graph g(4);
g.addEdge(0, 1);
g.addEdge(0, 2);
g.addEdge(1, 2);
g.addEdge(2, 0);
g.addEdge(2, 3);
g.addEdge(3, 3);
return 0;
}
The total number of spanning trees with n vertices that can be created from a complete graph is
equal to n (n-2).
41
Example of a Spanning Tree
Let's understand the spanning tree with examples below. Let a graph:
The spanning trees that can be created from the above graph are:
We now understand that one graph can have more than one spanning tree. Following are a few
properties of the spanning tree connected to graph G −
1. Spanning tree has n-1 edges, where n is the number of nodes (vertices).
2. From a complete graph, by removing maximum e - n + 1 edges, we can construct a
spanning tree.
3. A complete graph can have maximum nn-2 number of spanning trees.
42
Thus, we can conclude that spanning trees are a subset of connected Graph G and disconnected
graphs do not have spanning tree.
A minimum spanning tree is a spanning tree in which the sum of the weight of the edges is as
minimum as possible.
The minimal-cost spanning tree (MCST) problem takes as input a connected, undirected graph GG,
where each edge has a distance or weight measure attached. The MCST is the graph containing
the vertices of G along with the subset of G 's edges that (1) has minimum total cost as measured
by summing the values for all of the edges in the subset, and (2) keeps the vertices connected.
Applications where a solution to this problem is useful include soldering the shortest set of wires
needed to connect a set of terminals on a circuit board, and connecting a set of cities by telephone
lines in such a way as to require the least amount of cable.
The MCST contains no cycles. If a proposed MCST did have a cycle, a cheaper MCST could be
had by removing any one of the edges in the cycle. Thus, the MCST is a free tree with |V|−1 edges.
The name "minimum-cost spanning tree" comes from the fact that the required set of edges forms
a tree, it spans the vertices (i.e., it connects them together), and it has minimum cost.
43
The minimum spanning tree from the above spanning trees is:
The minimum spanning tree from a graph is found using the following algorithms:
1. Prim's Algorithm
2. Kruskal's Algorithm
Prim's Algorithm
Prim's algorithm is used to find a minimum cost spanning tree. This algorithm takes a graph as
input and finds the subset of the edges of that graph which
44
Prim's algorithm to find minimum cost spanning tree uses the greedy approach which finds the
local optimum in the hopes of finding a global optimum.
Prim's algorithm starts from one vertex and keep adding edges with the lowest weight until we we
reach our goal.
Prim's algorithm is very simple. Start with any Vertex NN in the graph, setting the MCST to be N
initially. Pick the least-cost edge connected to N. This edge connects N to another vertex; call this
M. Add Vertex M and Edge (N,M) to the MCST. Next, pick the least-cost edge coming from either
N or M to any other vertex in the graph. Add this edge and the new vertex it reaches to the MCST.
This process continues, at each step expanding the MCST by selecting the least-cost edge from a
vertex currently in the MCST to a vertex not currently in the MCST.
Prim's algorithm is quite similar to Dijkstra's algorithm for finding the single-source shortest paths.
The primary difference is that we are seeking not the next closest vertex to the start vertex, but
rather the next closest vertex to any vertex currently in the MCST.
45
A C++ program for Prim's Minimum
46
// Initialize min value
int min = INT_MAX, min_index;
return min_index;
}
47
// set of vertices not yet included in MST
int u = minKey(key, mstSet);
int main()
{
/* Let us create the following graph
2 3
(0)--(1)--(2)
| / \ |
6| 8/ \5 |7
| / \ |
(3)-------(4)
9 */
int graph[V][V] = { { 0, 2, 0, 6, 0 },
{ 2, 0, 3, 8, 5 },
{ 0, 3, 0, 0, 7 },
{ 6, 8, 0, 0, 9 },
{ 0, 5, 7, 9, 0 } };
return 0;
48
}
Kruskal's algorithm is another popular minimum spanning tree algorithm that uses a different logic
to find the MST of a graph. Instead of starting from an vertex, Kruskal's algorithm sorts all the
edges from low weight to high and keeps adding the lowest edges, ignoring those edges that create
a cycle.
Kruskal's Algorithm
Kruskal's algorithm is a minimum spanning tree algorithm that takes a graph as input and finds the
subset of the edges of that graph which
It is a greedy algorithms which find the local optimum in the hopes of finding a global optimum.
This algorithm starts from an edge with the lowest weight and keep adding edges until a minimum
spanning tree is built.
49
Kruskal's algorithm is also a simple, greedy algorithm. First partition the set of vertices into |V|
disjoint sets, each consisting of one vertex. Then process the edges in order of weight. An edge is
added to the MCST, and two disjoint sets combined, if the edge connects two vertices in different
disjoint sets. This process is repeated until only one disjoint set remains.
The edges can be processed in order of weight by using a min-heap. This is generally faster than
sorting the edges first, because in practice we need only visit a small fraction of the edges before
completing the MCST.
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
50
#define edge pair<int,int>
class Graph {
private:
vector<pair<int, edge>> G; // graph
vector<pair<int, edge>> T; // mst
int *parent;
int V; // number of vertices/nodes in graph
public:
Graph(int V);
void AddWeightedEdge(int u, int v, int w);
int find_set(int i);
void union_set(int u, int v);
void kruskal();
void print();
};
Graph::Graph(int V) {
parent = new int[V];
//i 0 1 2 3 4 5
//parent[i] 0 1 2 3 4 5
for (int i = 0; i < V; i++)
parent[i] = i;
G.clear();
T.clear();
}
void Graph::AddWeightedEdge(int u, int v, int w) {
G.push_back(make_pair(w, edge(u, v)));
}
int Graph::find_set(int i) {
// If i is the parent of itself
if (i == parent[i])
return i;
else
// Else if i is not the parent of itself
// Then i is not the representative of his set,
// so we recursively call Find on its parent
return find_set(parent[i]);
}
51
T.push_back(G[i]); // add to tree
union_set(uRep, vRep);
}
}
}
void Graph::print() {
cout << "Edge :" << " Weight" << endl;
for (int i = 0; i < T.size(); i++) {
cout << T[i].second.first << " - " << T[i].second.second << ":"
<< T[i].first;
cout << endl;
}
}
int main() {
Graph g(6);
g.AddWeightedEdge(0, 1, 4);
g.AddWeightedEdge(0, 2, 4);
g.AddWeightedEdge(1, 2, 2);
g.AddWeightedEdge(1, 0, 4);
g.AddWeightedEdge(2, 0, 4);
g.AddWeightedEdge(2, 1, 2);
g.AddWeightedEdge(2, 3, 3);
g.AddWeightedEdge(2, 5, 2);
g.AddWeightedEdge(2, 4, 4);
g.AddWeightedEdge(3, 2, 3);
g.AddWeightedEdge(3, 4, 3);
g.AddWeightedEdge(4, 2, 4);
g.AddWeightedEdge(4, 3, 3);
g.AddWeightedEdge(5, 2, 2);
g.AddWeightedEdge(5, 4, 3);
g.kruskal();
g.print();
return 0;
}
Prim's algorithm is another popular minimum spanning tree algorithm that uses a different logic
to find the MST of a graph. Instead of starting from an edge, Prim's algorithm starts from a vertex
and keeps adding lowest-weight edges which aren't in the tree, until all vertices have been covered.
Both Prim’s and Kruskal’s algorithm finds the Minimum Spanning Tree and follow the Greedy
approach of problem-solving, but there are few major differences between them.
52
It starts to build the Minimum It starts to build the Minimum Spanning Tree from the
Spanning Tree from any vertex in the vertex carrying minimum weight in the graph.
graph.
It traverses one node more than one It traverses one node only once.
time to get the minimum distance.
Prim’s algorithm has a time Kruskal’s algorithm’s time complexity is O(logV), V
complexity of O(V^2), V being the being the number of vertices.
number of vertices.
Prim’s algorithm gives connected Kruskal’s algorithm can generate forest(disconnected
component as well as it works only components) at any instant as well as it can work on
on connected graph. disconnected components
Prim’s algorithm runs faster in dense Kruskal’s algorithm runs faster in sparse graphs.
graphs.
Shortest-Paths Problems
On a road map, a road connecting two towns is typically labeled with its distance. We can model
a road network as a directed graph whose edges are labeled with real numbers. These numbers
represent the distance (or other cost metric, such as travel time) between two vertices. These labels
may be called weights, costs, or distances, depending on the application. Given such a graph, a
typical problem is to find the total length of the shortest path between two specified vertices. This
is not a trivial problem, because the shortest path may not be along the edge (if any) connecting
two vertices, but rather may be along a path involving one or more intermediate vertices.
For example, in Figure 14.5.1, the cost of the path from A to B to D is 15. The cost of the edge
directly from A to D is 20. The cost of the path from A to C to B to D is 10. Thus, the shortest
path from A to D is 10 (rather than along the edge connecting A to D). We use the notation
d(A,D)=10 to indicate that the shortest distance from A to D is 10. In Figure 14.5.1, there is no
path from E to B, so we set d(E,B)=∞. We define w(A,D)=20 to be the weight of edge (A,D), that
is, the weight of the direct connection from A to D. Because there is no edge from E to B
,w(E,B)=∞. Note that w(D,A)=∞ because the graph of Figure 14.5.1 is directed. We assume that
all weights are positive.
53
Figure 14.5.1: Example graph for shortest-path definitions.
We will now present an algorithm to solve the single-source shortest paths problem. Given Vertex
S in Graph G, find a shortest path from S to every other vertex in G. We might want only the
shortest path between two vertices, SS and T. However in the worst case, finding the shortest path
from S to T requires us to find the shortest paths from S to every other vertex as well. So there is
no better algorithm (in the worst case) for finding the shortest path to a single vertex than to find
shortest paths to all vertices. The algorithm described here will only compute the distance to every
such vertex, rather than recording the actual path.
Computer networks provide an application for the single-source shortest-paths problem. The goal
is to find the cheapest way for one computer to broadcast a message to all other computers on the
network. The network can be modeled by a graph with edge weights indicating time or cost to send
a message to a neighboring computer.
Dijkstra's algorithm
Dijkstra's algorithm finds the shortest path between any two vertices of a graph. It is a solution to
the single-source shortest path problem in graph theory
54
(Edsger Wybe Dijkstra- May 11, 1930 – August 6, 2002 ter scientist from
Netherlands ious
award in computer science )
Dijkstra’s algorithm finds a shortest path tree from a single source node, by building a set of nodes
that have minimum distance from the source.
Dijkstra's Algorithm works on the basis that any subpath B -> D of the shortest path A -> D
between vertices A and D is also the shortest path between vertices B and D.
Djikstra used this property in the opposite direction i.e we overestimate the distance of each vertex
from the starting vertex. Then we visit each node and its neighbours to find the shortest subpath to
those neighbours.
The algorithm uses a greedy approach in the sense that we find the next best solution hoping that
the end result is the best solution for the whole problem.
1. Mark your selected initial node with a current distance of 0 and the rest with infinity.
2. Set the non-visited node with the smallest current distance as the current node C .
3. For each neighbour N of your current node C : add the current distance of C with the weight
of the edge connecting C -> N . If it's smaller than the current distance of N , set it as the new
current distance of N .
4. Mark the current node C as visited.
5. If there are non-visited nodes, go to step 2.
55
Example of Dijkstra's algorithm
56
Time Complexity of Dijkstra's Algorithm is O (V2) but with min-priority queue it drops down to
O (V + E log V).
1. Set all vertices distances = infinity except for the source vertex, set the source distance =
0.
2. Push the source vertex in a min-priority queue in the form (distance , vertex), as the
comparison in the min-priority queue will be according to vertices distances.
3. Pop the vertex with the minimum distance from the priority queue (at first the popped
vertex = source).
4. Update the distances of the connected vertices to the popped vertex in case of "current
vertex distance + edge weight < next vertex distance", then push the vertex
with the new distance to the priority queue.
5. If the popped vertex is visited before, just continue without using it.
6. Apply the same algorithm again until the priority queue is empty.
Implementing Algorithm
It works by maintaining a distance estimate D(X) for all vertices X in V. The elements of D are
initialized to the value INFINITE. Vertices are processed in order of distance from S. Whenever
a vertex v is processed, D(X) is updated for every neighbor X of V. Here is an implementation for
Dijkstra's algorithm. At the end, array D will contain the shortest distance values.
57
D[i] = INFINITY;
D[s] = 0;
for (int i=0; i<G.nodeCount(); i++) { // Process the vertices
int v = minVertex(G, D); // Find next-closest vertex
G.setValue(v, VISITED);
if (D[v] == INFINITY) return; // Unreachable
int[] nList = G.neighbors(v);
for (int j=0; j<nList.length; j++) {
int w = nList[j];
if (D[w] > (D[v] + G.weight(v, w)))
D[w] = D[v] + G.weight(v, w);
}
}
}
We next consider the problem of finding the shortest distance between all pairs of vertices in the graph,
called the all-pairs shortest paths problem.
Floyd–Warshall's Algorithm
Floyd–Warshall's Algorithm is used to find the shortest paths between all pairs of vertices in a
graph, where each edge in the graph has a weight which is positive or negative. The biggest
advantage of using this algorithm is that all the shortest distances between any 2 vertices could be
calculated in O (V3), where V is the number of vertices in a graph.
The dist[i][k] represents the shortest path that only uses the first K vertices, dist[k][j] represents
the shortest path between the pair k,j. As the shortest path will be a concatenation of the shortest
path from i to k, then from k to j.
58
for(int k = 1; k <= n; k++){
for(int i = 1; i <= n; i++){
for(int j = 1; j <= n; j++){
dist[i][j] = min( dist[i][j], dist[i][k] + dist[k][j] );
}
}
}
Time Complexity of Floyd–Warshall's Algorithm is O(V3), where V is the number of vertices in a graph.
Example
Consider a graph as
Follow the steps below to find the shortest path between all the pairs of vertices.
1. Create a matrix A1 of dimension n*n where n is the number of vertices. The row and the
column are indexed as i and j respectively. i and j are the vertices of the graph.
Each cell A[i][j] is filled with the distance from the ith vertex to the jth vertex. If there is
no path from ith vertex to jth vertex, the cell is left as infinity.
2. Now, create a matrix A1 using matrix A0. The elements in the first column and the first row
are left as they are. The remaining cells are filled in the following way. Let k be the
intermediate vertex in the shortest path from source to destination. In this step, k is the first
vertex. A[i][j] is filled with (A[i][k] + A[k][j]) if (A[i][j] > A[i][k] + A[k][j]).That is, if the
59
direct distance from the source to the destination is greater than the path through the vertex
k, then the cell is filled with A[i][k] + A[k][j].
For example: For A1[2, 4], the direct distance from vertex 2 to 4 is 4 and the sum of the
distance from vertex 2 to 4 through vertex (ie. from vertex 2 to 1 and from vertex 1 to 4) is
7. Since 4 < 7, A0[2, 4] is filled with 4.
3. In a similar way, A2 is created using A3. The elements in the second column and the second
row are left as they are. In this step, k is the second vertex. The remaining steps are the
same as in step 2.
60
5. A4 gives the shortest path between each pair of vertices.
Floyd-Warshall’s Algorithm
n = no of vertices
A = matrix of dimension n*n
for k = 1 to n
for i = 1 to n
for j = 1 to n
Ak[i, j] = min (Ak-1[i, j], Ak-1[i, k] + Ak-1[k,
j])
return A
61