Unit-5 Greedy Algorithm
"Greedy Method finds out of many options, but you have to choose the best option."
In this method, we have to find out the best method/option out of many present ways.
In this approach/method we focus on the first stage and decide the output, don't think about the future.
A greedy algorithm solves a problem by making the choice that seems best at the particular moment. Many
optimization problems can be solved using a greedy algorithm. Some problems have no efficient exact solution,
but a greedy algorithm may provide a solution that is close to optimal. A greedy algorithm works if a problem
exhibits the following two properties:
1. Greedy Choice Property: A globally optimal solution can be arrived at by making a locally optimal
choice at each step. In other words, an optimal solution can be obtained by making "greedy" choices.
2. Optimal Substructure: Optimal solutions contain optimal subsolutions. In other words, solutions to
subproblems of an optimal solution are themselves optimal.
Examples:
1. Machine scheduling
2. Huffman coding
3. Job sequencing with deadlines
An important characteristic of the greedy method is that its decisions are unalterable: once a decision is made, that choice is not altered at any subsequent step.
Activity Selection Problem-
Suppose S = {1, 2, ..., n} is the set of n proposed activities. The activities share a resource which can be used by
only one activity at a time, e.g., a tennis court or a lecture hall. Each activity "i" has a start time si and a
finish time fi, where si ≤ fi. If selected, activity "i" takes place during the half-open time interval [si, fi).
Activities i and j are compatible if the intervals [si, fi) and [sj, fj) do not overlap (i.e. i and j are compatible if si
≥ fj or sj ≥ fi). The activity-selection problem is to choose a maximum-size set of mutually compatible activities.
Example: Given 10 activities along with their start and end time as
S = (A1 A2 A3 A4 A5 A6 A7 A8 A9 A10)
Si = (1,2,3,4,7,8,9,9,11,12)
fi = (3,5,4,7,10,9,11,13,12,14)
Compute a schedule where the greatest number of activities takes place.
Solution: The solution to the above activity scheduling problem using the greedy strategy is illustrated below:
First, sort the activities in increasing order of their finish time.
Schedule A1 first, then schedule A3 as A1 and A3 are non-interfering.
Skip A2 as it interferes with A3.
Next, schedule A4 as A1, A3 and A4 are non-interfering, then schedule A6 as A1, A3, A4 and A6 are
non-interfering.
Skip A5 as it is interfering.
Next, schedule A7 as A1, A3, A4, A6 and A7 are non-interfering.
Next, schedule A9 as A1, A3, A4, A6, A7 and A9 are non-interfering.
Skip A8 as it interferes with A9.
Next, schedule A10 as A1, A3, A4, A6, A7, A9 and A10 are non-interfering.
Hence, the maximum number of activities that can be scheduled is 7: {A1, A3, A4, A6, A7, A9, A10}.
Prim’s Algorithm-
Prim’s Algorithm is a greedy algorithm used to find the minimum spanning tree of a graph. It proceeds as follows-
Step-01:
● Randomly choose any vertex as the starting vertex of the tree.
Step-02:
● Find all the edges that connect the tree to new vertices.
● Find the least weight edge among those edges and include it in the existing tree.
● If including that edge creates a cycle, then reject that edge and look for the next least
weight edge.
Step-03:
● Keep repeating Step-02 until all the vertices are included and the Minimum Spanning
Tree (MST) is obtained.
Time Complexity-
● If an adjacency list is used to represent the graph, then using breadth first search, all
the vertices can be traversed in O(V + E) time.
● We traverse all the vertices of the graph using breadth first search and use a min
heap for storing the vertices not yet included in the MST.
● To get the minimum weight edge, we use the min heap as a priority queue.
● Min heap operations like extracting the minimum element and decreasing a key value
take O(logV) time.
So, overall time complexity
= O(E + V) x O(logV)
= O((E + V)logV)
= O(ElogV)
This time complexity can be improved and reduced to O(E + VlogV) using a Fibonacci heap.
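The steps above can be sketched in Python using the standard `heapq` min heap (a minimal sketch; the sample graph is illustrative and not taken from the problems below):

```python
import heapq

def prim_mst(graph, start):
    """Prim's Algorithm over an adjacency-list graph {u: [(v, w), ...]}.
    Returns (total_cost, mst_edges)."""
    visited = {start}
    # Min heap of candidate edges (weight, u, v) leaving the current tree.
    heap = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(heap)
    cost, edges = 0, []
    while heap and len(visited) < len(graph):
        w, u, v = heapq.heappop(heap)
        if v in visited:              # would create a cycle: reject the edge
            continue
        visited.add(v)
        cost += w
        edges.append((u, v, w))
        for x, wx in graph[v]:        # new candidate edges leaving the tree
            if x not in visited:
                heapq.heappush(heap, (wx, v, x))
    return cost, edges

# Small undirected example graph (assumed for illustration).
g = {
    'a': [('b', 1), ('c', 4)],
    'b': [('a', 1), ('c', 2), ('d', 6)],
    'c': [('a', 4), ('b', 2), ('d', 3)],
    'd': [('b', 6), ('c', 3)],
}
print(prim_mst(g, 'a'))   # → (6, [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 3)])
```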
Problem-01:
Construct the minimum spanning tree (MST) for the given graph using Prim’s Algorithm-
Solution-
The above discussed steps are followed to find the minimum cost spanning tree using Prim’s
Algorithm-
Steps 01 to 06: (the figures showing the tree growing one vertex at a time are not available.)
Since all the vertices have been included in the MST, we stop.
Now, Cost of Minimum Spanning Tree
= 10 + 25 + 22 + 12 + 16 + 14
= 99 units
Problem-02:
Using Prim’s Algorithm, find the cost of minimum spanning tree (MST) of the given graph-
Solution-
The minimum spanning tree obtained by the application of Prim’s Algorithm on the given
graph is as shown below-
Now, Cost of Minimum Spanning Tree
= 1 + 4 + 2 + 6 + 3 + 10
= 26 units
Kruskal’s Algorithm-
Kruskal’s Algorithm is a greedy algorithm used to find the minimum spanning tree of a graph. It proceeds as follows-
Step-01:
● Sort all the edges of the graph in increasing order of their weight.
Step-02:
● Take the edge with the lowest weight and use it to connect the vertices of the graph.
● If adding an edge creates a cycle, then reject that edge and go for the next least
weight edge.
Step-03:
● Keep adding edges until all the vertices are connected and a Minimum Spanning Tree
(MST) is obtained.
Time Complexity-
Worst case time complexity of Kruskal’s Algorithm
= O(ElogV) or O(ElogE)
Analysis-
● Sorting the edges (or maintaining them in a min heap) dominates the running time and takes O(ElogE) time.
● Since E can be at most O(V²), logE is O(logV), so O(ElogE) and O(ElogV) are of the same order.
Special Case-
● If the edges are already sorted, then there is no need to construct a min heap.
● So, deletion from min heap time is saved.
● In this case, time complexity of Kruskal’s Algorithm = O(E + V)
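Cycle detection in Kruskal’s Algorithm is commonly done with a union-find (disjoint set) structure. A minimal Python sketch (the sample edge list is illustrative, not from the problems below):

```python
def kruskal_mst(num_vertices, edges):
    """Kruskal's Algorithm with union-find for cycle detection.
    edges: list of (weight, u, v); vertices are 0..num_vertices-1.
    Returns (total_cost, chosen_edges)."""
    parent = list(range(num_vertices))

    def find(x):                      # find component root with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    cost, chosen = 0, []
    for w, u, v in sorted(edges):     # take edges in increasing weight order
        ru, rv = find(u), find(v)
        if ru == rv:                  # same component: edge would form a cycle
            continue
        parent[ru] = rv               # union the two components
        cost += w
        chosen.append((u, v, w))
    return cost, chosen

edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (6, 1, 3), (3, 2, 3)]
print(kruskal_mst(4, edges))   # → (6, [(0, 1, 1), (1, 2, 2), (2, 3, 3)])
```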
Problem-01:
Construct the minimum spanning tree (MST) for the given graph using Kruskal’s Algorithm-
Solution-
Steps 01 to 07: (the figures showing the edges being added one at a time are not available.)
Since all the vertices have been connected / included in the MST, we stop.
Now, Cost of Minimum Spanning Tree
= 10 + 25 + 22 + 12 + 16 + 14
= 99 units
Prim’s Algorithm vs Kruskal’s Algorithm-
We have discussed both Prim’s Algorithm and Kruskal’s Algorithm for finding the minimum spanning tree. The following concepts relate the two-
Concept-01:
If all the edge weights are distinct, then both the algorithms are guaranteed to find the same
MST.
Example-
Here, both the algorithms on the above given graph produce the same MST as shown.
Concept-02:
● If all the edge weights are not distinct, then both the algorithms may not always
produce the same MST.
● However, cost of both the MSTs would always be same in both the cases.
Example-
Concept-03:
Concept-04:
The differences between Prim’s Algorithm and Kruskal’s Algorithm are-
● The tree that Prim’s Algorithm grows always remains connected, whereas the forest that
Kruskal’s Algorithm grows usually remains disconnected until the final step.
● Prim’s Algorithm grows a solution from a random vertex by adding the next cheapest
vertex to the existing tree, whereas Kruskal’s Algorithm grows a solution from the
cheapest edge by adding the next cheapest edge.
● Prim’s Algorithm is faster for dense graphs, whereas Kruskal’s Algorithm is faster for
sparse graphs.
● The shortest path problem is the problem of finding the shortest path(s) between vertices of
a given graph.
● The shortest path between two vertices is the path that has the least cost as compared to all
other existing paths.
Shortest path algorithms are a family of algorithms used for solving the shortest path problem.
Applications-
● Google Maps
● Road Networks
● Logistics Research
Single-Pair Shortest Path Problem-
● It is a shortest path problem where the shortest path between a given pair of vertices
is computed.
● A* Search Algorithm is a famous algorithm used for solving the single-pair shortest path
problem.
Single-Source Shortest Path Problem-
● It is a shortest path problem where the shortest path from a given source vertex to all
other remaining vertices is computed.
● Dijkstra’s Algorithm and Bellman-Ford Algorithm are the famous algorithms used
for solving the single-source shortest path problem.
Single-Destination Shortest Path Problem-
● It is a shortest path problem where the shortest path from all the vertices to a single
destination vertex is computed.
● By reversing the direction of each edge in the graph, this problem reduces to the
single-source shortest path problem.
● Dijkstra’s Algorithm is a famous algorithm adapted for solving the single-destination
shortest path problem.
All Pairs Shortest Path Problem-
● It is a shortest path problem where the shortest path between every pair of vertices is
computed.
● Floyd-Warshall Algorithm and Johnson’s Algorithm are the famous algorithms used
for solving All pairs shortest path problem.
Dijkstra Algorithm-
Dijkstra Algorithm is a famous greedy algorithm used for solving the single-source shortest path problem.
Conditions-
● It can be applied to both directed and undirected graphs.
● All the edge weights must be non-negative.
Step-01:
● One set contains all those vertices which have been included in the shortest path tree.
● In the beginning, this set is empty.
● Other set contains all those vertices which are still left to be included in the shortest
path tree.
● In the beginning, this set contains all the vertices of the given graph.
Step-02:
For each vertex of the given graph, two variables are defined as-
● Π[v], which denotes the predecessor of vertex ‘v’ on the shortest path from the source.
● d[v], which denotes the shortest path estimate of vertex ‘v’ from the source.
These variables are initialized as-
● The value of variable ‘Π’ for each vertex is set to NIL i.e. Π[v] = NIL
● The value of variable ‘d’ for the source vertex is set to 0 i.e. d[S] = 0
● The value of variable ‘d’ for the remaining vertices is set to ∞ i.e. d[v] = ∞
Step-03:
The following procedure is repeated until all the vertices of the graph are processed-
● Among unprocessed vertices, a vertex with minimum value of variable ‘d’ is chosen.
● Its outgoing edges are relaxed.
● After relaxing the edges for that vertex, the sets created in step-01 are updated.
Edge Relaxation-
Consider an edge (a, b) of weight w being relaxed. Here, d[a] and d[b] denote the shortest path
estimates for vertices a and b respectively from the source vertex ‘S’.
Now, two cases are possible-
Case-01: d[a] + w < d[b]
Here, a shorter path to vertex ‘b’ through vertex ‘a’ has been found. So, update d[b] = d[a] + w
and Π[b] = a.
Case-02: d[a] + w ≥ d[b]
Here, no shorter path to vertex ‘b’ has been found. So, d[b] and Π[b] remain unchanged.
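The two relaxation cases can be sketched as follows (a minimal sketch; the function and variable names are assumptions):

```python
def relax(a, b, weight, d, pi):
    """Relax edge (a, b): if going through 'a' gives a shorter path
    estimate for 'b', update d[b] and record 'a' as b's predecessor."""
    if d[a] + weight < d[b]:     # Case-01: shorter path found via a
        d[b] = d[a] + weight
        pi[b] = a
        return True
    return False                 # Case-02: no change

# Relax the edge S -> a of weight 1 from the initialized estimates.
d = {'S': 0, 'a': float('inf')}
pi = {'S': None, 'a': None}
relax('S', 'a', 1, d, pi)
print(d['a'], pi['a'])   # → 1 S
```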
Time Complexity-
● With adjacency list representation, all vertices of the graph can be traversed using
BFS in O(V+E) time.
● In a min heap, operations like extract-min and decrease-key take O(logV) time.
● So, the overall time complexity becomes O(E+V) x O(logV), which is O((E + V) x logV) =
O(ElogV).
● This time complexity can be reduced to O(E+VlogV) using a Fibonacci heap.
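The whole algorithm can be sketched in Python with `heapq` as the min heap (a minimal sketch; the sample graph is reconstructed from the relaxation steps of the problem below, and since the original figure is missing its exact edges should be treated as an assumption):

```python
import heapq

def dijkstra(graph, source):
    """Dijkstra's Algorithm over an adjacency list {u: [(v, w), ...]},
    using a min heap as the priority queue. Returns shortest distances."""
    d = {v: float('inf') for v in graph}
    d[source] = 0
    heap = [(0, source)]
    visited = set()
    while heap:
        dist, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)              # u moves to the visited set
        for v, w in graph[u]:       # relax all outgoing edges of u
            if dist + w < d[v]:
                d[v] = dist + w
                heapq.heappush(heap, (d[v], v))
    return d

# Edges inferred from the worked example below (assumption).
g = {
    'S': [('a', 1), ('b', 5)],
    'a': [('b', 2), ('c', 2), ('d', 1)],
    'b': [('d', 2)],
    'c': [('e', 1)],
    'd': [('e', 2)],
    'e': [],
}
print(dijkstra(g, 'S'))   # → {'S': 0, 'a': 1, 'b': 3, 'c': 3, 'd': 2, 'e': 4}
```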
Problem-
Using Dijkstra’s Algorithm, find the shortest distance from source vertex ‘S’ to remaining
vertices in the following graph-
Also, write the order in which the vertices are visited.
Solution-
Step-01:
● Unvisited set : {S , a , b , c , d , e}
● Visited set :{}
Step-02:
The two variables Π and d are created for each vertex and initialized as-
Step-03:
Vertex ‘S’ is chosen since it has the minimum value of d. Its outgoing edges are relaxed-
Now,
● d[S] + 1 = 0 + 1 = 1 < ∞, so d[a] is updated to 1 and Π[a] = S
● d[S] + 5 = 0 + 5 = 5 < ∞, so d[b] is updated to 5 and Π[b] = S
After relaxing the edges, the sets are updated as-
● Unvisited set : {a , b , c , d , e}
● Visited set : {S}
Step-04:
Vertex ‘a’ is chosen since it has the minimum value of d among the unvisited vertices. Its outgoing edges are relaxed-
● d[a] + 2 = 1 + 2 = 3 < ∞, so d[c] is updated to 3 and Π[c] = a
● d[a] + 1 = 1 + 1 = 2 < ∞, so d[d] is updated to 2 and Π[d] = a
● d[a] + 2 = 1 + 2 = 3 < 5, so d[b] is updated to 3 and Π[b] = a
After relaxing the edges, the sets are updated as-
● Unvisited set : {b , c , d , e}
● Visited set : {S , a}
Step-05:
Vertex ‘d’ is chosen since it has the minimum value of d (= 2) among the unvisited vertices. Its outgoing edges are relaxed-
● d[d] + 2 = 2 + 2 = 4 < ∞, so d[e] is updated to 4 and Π[e] = d
After relaxing the edges, the sets are updated as-
● Unvisited set : {b , c , e}
● Visited set : {S , a , d}
Step-06:
Vertex ‘b’ is chosen since it has the minimum value of d (= 3) among the unvisited vertices. Its outgoing edges are relaxed-
Now,
● d[b] + 2 = 3 + 2 = 5 > 2, the current value of d[d]
∴ No change
After edge relaxation, our shortest path tree remains the same as in Step-05.
Now, the sets are updated as-
● Unvisited set : {c , e}
● Visited set : {S , a , d , b}
Step-07:
Vertex ‘c’ is chosen next. Its outgoing edges are relaxed-
Now,
● d[c] + 1 = 3 + 1 = 4 = 4, the current value of d[e]
∴ No change
After edge relaxation, our shortest path tree remains the same as in Step-05.
Step-08:
Finally, vertex ‘e’ is chosen, after which all the vertices are processed. The sets are updated as-
● Unvisited set : { }
● Visited set : {S , a , d , b , c , e}
The order in which the vertices are visited is S , a , d , b , c , e.
Knapsack Problem-
Given a set of items, each having some weight and value, and a knapsack with a limited weight capacity, the goal is to fill the knapsack such that-
● The value or profit obtained by putting the items into the knapsack is maximum.
● The weight limit of the knapsack is not exceeded.
Knapsack Problem Variants-
● 0/1 Knapsack Problem: an item is either put into the knapsack completely or not at all; items cannot be broken.
● Fractional Knapsack Problem: items can be broken, so a fraction of an item may be put into the knapsack.
The fractional knapsack problem is solved using the greedy method in the following steps-
Step-01:
For each item, compute its value / weight ratio.
Step-02:
Arrange all the items in decreasing order of their value / weight ratio.
Step-03:
Start putting the items into the knapsack beginning from the item with the highest ratio.
Put as many items as you can into the knapsack.
Time Complexity-
● The main time-taking step is the sorting of all items in decreasing order of their value
/ weight ratio.
● If the items are already arranged in the required order, then the filling loop takes O(n)
time.
● Otherwise the sort dominates; using Quick Sort, the average time complexity is O(nlogn).
● Therefore, the total time taken including the sort is O(nlogn).
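The three steps can be sketched in Python (a minimal sketch; the function name is my own, and the data used below is that of the worked problem that follows):

```python
def fractional_knapsack(weights, values, capacity):
    """Greedy fractional knapsack: sort items by value/weight ratio
    (decreasing) and fill the knapsack, taking a fraction of the first
    item that no longer fits."""
    items = sorted(zip(weights, values), key=lambda it: it[1] / it[0], reverse=True)
    total = 0.0
    for w, v in items:
        if capacity >= w:              # the whole item fits
            capacity -= w
            total += v
        else:                          # take only the fitting fraction
            total += v * capacity / w
            break
    return total

# Capacity 60 kg with the five items of the problem below.
print(fractional_knapsack([5, 10, 15, 22, 25], [30, 40, 45, 77, 90], 60))   # → 230.0
```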
Problem-
For the given set of items and knapsack capacity = 60 kg, find the optimal solution for the
fractional knapsack problem making use of greedy approach.
1 5 30
2 10 40
3 15 45
4 22 77
5 25 90
OR
Find the optimal solution for the fractional knapsack problem making use of greedy approach.
Consider-
n=5
w = 60 kg
(w1, w2, w3, w4, w5) = (5, 10, 15, 22, 25)
(b1, b2, b3, b4, b5) = (30, 40, 45, 77, 90)
OR
A thief enters a house for robbing it. He can carry a maximal weight of 60 kg in his bag.
There are 5 items in the house with the following weights and values. What items should the thief
take if he can carry even a fraction of any item with him?
Item Weight (kg) Value
1 5 30
2 10 40
3 15 45
4 22 77
5 25 90
Solution-
Step-01:
Compute the value / weight ratio of each item-
Item Weight (kg) Value Ratio (Value / Weight)
1 5 30 6
2 10 40 4
3 15 45 3
4 22 77 3.5
5 25 90 3.6
Step-02:
Sort all the items in decreasing order of their value / weight ratio-
I1 I2 I5 I4 I3
Step-03:
Start filling the knapsack by putting the items into it one by one-
Remaining Capacity (kg) Items in Knapsack Total Value
60 Ø 0
55 I1 30
45 I1, I2 70
20 I1, I2, I5 160
Now,
● The knapsack has 20 kg of capacity left, while the next item I4 weighs 22 kg.
● So, put the 20/22 fraction of item I4 into the knapsack.
Total value of the knapsack
= 160 + (20/22) x 77
= 160 + 70
= 230 units
Important Note-
Had the problem been a 0/1 knapsack problem, the knapsack would contain the following items-
< I1 , I2 , I5 >
Job Sequencing with Deadlines-
The sequencing of jobs on a single processor with deadline constraints is called Job
Sequencing with Deadlines.
Here-
● You are given a set of jobs.
● Each job has a defined deadline and an associated profit.
● The profit of a job is earned only if the job is completed on or before its deadline.
● Only one job can be processed at a time.
The problem asks-
“How can the total profit be maximized if only one job can be completed at a time?”
Approach to Solution-
● A feasible solution would be a subset of jobs where each job of the subset gets
completed within its deadline.
● Value of the feasible solution would be the sum of profit of all the jobs contained in
the subset.
● An optimal solution of the problem would be a feasible solution which gives the
maximum profit.
Greedy Algorithm-
Greedy Algorithm is adopted to determine how the next job is selected for an optimal solution.
The greedy algorithm described below always gives an optimal solution to the job sequencing
problem-
Step-01:
● Sort all the given jobs in decreasing order of their profit.
Step-02:
● Check the value of the maximum deadline.
● Draw a Gantt chart whose maximum time is the value of the maximum deadline.
Step-03:
● Pick the jobs one by one in the sorted order.
● Place each job on the Gantt chart as far from time 0 as possible, while ensuring that the job still completes before its deadline.
Problem-
Consider the following jobs with their deadlines and profits. Schedule the jobs so as to maximize the total profit-
Jobs J1 J2 J3 J4 J5 J6
Deadlines 5 3 3 2 4 2
Profits 200 180 190 300 120 100
Solution-
Step-01:
Sort all the given jobs in decreasing order of their profit-
Jobs J4 J1 J3 J2 J5 J6
Deadlines 2 5 3 3 4 2
Profits 300 200 190 180 120 100
Step-02:
Now,
● We take each job one by one in the order they appear in Step-01.
● We place the job on Gantt chart as far as possible from 0.
Steps 03 to 06: (the Gantt chart figures showing each job being placed are not available.)
Now, the parts of the problem are answered as-
Part-01:
The optimal schedule of jobs is J2 , J4 , J3 , J5 , J1.
This is the required order in which the jobs must be completed in order to obtain the maximum
profit.
Part-02:
All the jobs are not completed in the optimal schedule; job J6 could not be placed on the Gantt chart within its deadline.
Part-03:
Maximum earned profit
= Sum of the profits of all the jobs in the optimal schedule
= Profit of job J2 + Profit of job J4 + Profit of job J3 + Profit of job J5 + Profit of job J1
= 180 + 300 + 190 + 120 + 200
= 990 units
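The greedy schedule can be sketched in Python (a minimal sketch; the profit values below are assumed to match the problem above, since the original table omits them):

```python
def job_sequencing(jobs):
    """Greedy job sequencing with deadlines.
    jobs: list of (name, deadline, profit). Sort by profit (decreasing)
    and place each job in the latest free slot on or before its deadline."""
    jobs = sorted(jobs, key=lambda j: j[2], reverse=True)
    max_deadline = max(d for _, d, _ in jobs)
    slots = [None] * (max_deadline + 1)       # slots[1..max_deadline]
    profit = 0
    for name, deadline, p in jobs:
        for t in range(deadline, 0, -1):      # as far from 0 as possible
            if slots[t] is None:
                slots[t] = name
                profit += p
                break
    return slots[1:], profit

# Assumed data for the problem above.
jobs = [('J1', 5, 200), ('J2', 3, 180), ('J3', 3, 190),
        ('J4', 2, 300), ('J5', 4, 120), ('J6', 2, 100)]
print(job_sequencing(jobs))   # → (['J2', 'J4', 'J3', 'J5', 'J1'], 990)
```

Note that J6 finds no free slot on or before its deadline of 2, so it is the one job left unscheduled.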
Huffman Coding-
Huffman coding is a famous greedy algorithm used for lossless data compression. It assigns variable-length codes to characters: frequently occurring characters get shorter codes and rarely occurring characters get longer codes.
Prefix Rule-
Huffman coding generates prefix codes, i.e. no code assigned to one character is the prefix of the code assigned to any other character. This rule ensures that the encoded message can be decoded unambiguously.
Huffman Tree-
The Huffman tree is built as follows-
Step-01:
● Create a leaf node for each character of the message, weighted by the character’s frequency.
Step-02:
● Pick the two nodes having the minimum frequency.
Step-03:
● Join the two picked nodes under a new internal node whose frequency is the sum of their frequencies.
Step-04:
● Keep repeating Step-02 and Step-03 until all the nodes form a single tree.
● The tree finally obtained is the desired Huffman Tree.
Time Complexity-
Using a min heap to pick the two minimum-frequency nodes, the time complexity of Huffman coding is O(nlogn), where n is the number of distinct characters.
Important Formulas-
The following 2 formulas are important to solve the problems based on Huffman Coding-
Formula-01:
Average code length per character
= ∑ (frequency of character x code length of character) / ∑ (frequency of character)
Formula-02:
Total number of bits in Huffman encoded message
= Total number of characters in the message x Average code length per character
Problem-
A file contains the following characters with the frequencies as shown. If Huffman coding is
used for data compression, determine-
1. The Huffman code for each character
2. The average code length per character
3. The length of the Huffman encoded message (in bits)
Characters Frequencies
a 10
e 15
i 12
o 3
u 4
s 13
t 1
Solution-
Steps 01 to 07: (the figures showing the Huffman tree being built by repeatedly merging the two minimum-frequency nodes are not available.)
Now, assign weights to the edges of the Huffman tree.
Rule-
● If you assign weight ‘0’ to the left edges, then assign weight ‘1’ to the right edges.
● If you assign weight ‘1’ to the left edges, then assign weight ‘0’ to the right edges.
● Either of the above two conventions may be followed.
● But follow the same convention at the time of decoding that was adopted at the time
of encoding.
After assigning weights to all the edges, the modified Huffman tree is- (figure not available)
Now, let us answer each part of the given problem one by one-
To write Huffman Code for any character, traverse the Huffman Tree from root node to the
leaf node of that character.
Following this rule, the Huffman Code for each character is-
● a = 111
● e = 10
● i = 00
● o = 11001
● u = 1101
● s = 01
● t = 11000
● Characters occurring less frequently in the text are assigned longer codes.
● Characters occurring more frequently in the text are assigned shorter codes.
Average code length per character
= ∑ (frequency x code length) / ∑ (frequency)
= { (10 x 3) + (15 x 2) + (12 x 2) + (3 x 5) + (4 x 4) + (13 x 2) + (1 x 5) } / (10 + 15 + 12 + 3 + 4 + 13 + 1)
= 146 / 58
= 2.52
Total number of bits in Huffman encoded message
= Total number of characters in the message x Average code length per character
= 58 x 2.52
= 146.16
≅ 147 bits
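The tree construction can be sketched with a min heap; run on the frequencies of this problem, it reproduces the code lengths derived above (a minimal sketch; node merging is tracked as a map from character to its depth, which is an implementation choice of mine):

```python
import heapq
from itertools import count

def huffman_code_lengths(freqs):
    """Build a Huffman tree with a min heap and return {char: code length}.
    Each heap entry is (frequency, tiebreak, {char: depth})."""
    tiebreak = count()                 # keeps heap entries comparable
    heap = [(f, next(tiebreak), {c: 0}) for c, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # two minimum-frequency nodes
        f2, _, right = heapq.heappop(heap)
        # Merging under a new root pushes every character one level deeper.
        merged = {c: depth + 1 for c, depth in {**left, **right}.items()}
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

freqs = {'a': 10, 'e': 15, 'i': 12, 'o': 3, 'u': 4, 's': 13, 't': 1}
lengths = huffman_code_lengths(freqs)
total_bits = sum(freqs[c] * lengths[c] for c in freqs)
print(total_bits)   # → 146
```

This matches the value 146 computed with Formula-01 above (58 x 2.52 ≅ 146 bits).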