Unit-5 Greedy Algorithm


Greedy Algorithm Introduction

"Greedy Method finds out of many options, but you have to choose the best option."

In this method, we have to find out the best method/option out of many present ways.

In this approach/method we focus on the first stage and decide the output, don't think about the future.

This method may or may not give the best output.

A greedy algorithm solves problems by making the choice that seems best at the particular moment. Many
optimization problems can be solved using a greedy algorithm. Some problems have no efficient exact solution,
but a greedy algorithm may provide a solution that is close to optimal. A greedy algorithm works if a problem
exhibits the following two properties:

1. Greedy Choice Property: A globally optimal solution can be arrived at by making locally optimal
choices. In other words, an optimal solution can be obtained by making "greedy" choices.

2. Optimal Substructure: Optimal solutions contain optimal subsolutions. In other words, the answers to
subproblems of an optimal solution are themselves optimal.

Example:
1. machine scheduling

2. Fractional Knapsack Problem

3. Minimum Spanning Tree

4. Huffman Code

5. Job Sequencing

6. Activity Selection Problem

The choices made by a greedy algorithm have three characteristics:


1. Feasible: Here we check whether the choice satisfies all the problem's constraints, so that it can be
extended to at least one complete solution to our problem.
2. Locally Optimal: The choice made should be the optimum among the currently
available options.

3. Unalterable: Once the decision is made, that choice is not altered at any subsequent step.

An Activity Selection Problem


The activity selection problem is a mathematical optimization problem. Our first illustration is the problem
of scheduling a single resource among several competing activities. A greedy algorithm provides a well-designed
and simple method for selecting a maximum-size set of mutually compatible activities.

Suppose S = {1, 2, ..., n} is the set of n proposed activities. The activities share a resource which can be used by
only one activity at a time, e.g., a tennis court or a lecture hall. Each activity i has a start time si and a
finish time fi, where si ≤ fi. If selected, activity i takes place during the half-open time interval [si, fi).
Activities i and j are compatible if the intervals [si, fi) and [sj, fj) do not overlap (i.e., i and j are compatible if si
≥ fj or sj ≥ fi). The activity-selection problem is to choose a maximum-size set of mutually compatible activities.

Algorithm Of Greedy- Activity Selector:


GREEDY-ACTIVITY-SELECTOR (s, f)    // assumes the activities are sorted so that f1 ≤ f2 ≤ ... ≤ fn
1. n ← length [s]
2. A ← {1}
3. j ← 1
4. for i ← 2 to n
5. do if si ≥ fj
6. then A ← A ∪ {i}
7. j ← i
8. return A

Example: Given 10 activities along with their start and end time as

S = (A1, A2, A3, A4, A5, A6, A7, A8, A9, A10)

Si = (1, 2, 3, 4, 7, 8, 9, 9, 11, 12)

fi = (3, 5, 4, 7, 10, 9, 11, 13, 12, 14)
Compute a schedule where the greatest number of activities takes place.

Solution: The solution to the above Activity scheduling problem using a greedy strategy is illustrated below:

Arrange the activities in increasing order of finish time.

Now, schedule A1.

Next, schedule A3, as A1 and A3 are non-interfering.

Next, skip A2 as it interferes.

Next, schedule A4, as A1, A3 and A4 are non-interfering. Then schedule A6, as A1, A3, A4 and A6 are
non-interfering.
Skip A5 as it interferes.

Next, schedule A7, as A1, A3, A4, A6 and A7 are non-interfering.

Next, schedule A9, as A1, A3, A4, A6, A7 and A9 are non-interfering.

Skip A8 as it interferes.

Next, schedule A10, as A1, A3, A4, A6, A7, A9 and A10 are non-interfering.

Thus, the final activity schedule is: A1, A3, A4, A6, A7, A9, A10.
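The walkthrough above can be sketched in Python. This is a minimal illustration of the greedy selector using the example's start and finish times; the activity names `A1`–`A10` are taken from the data above.

```python
# Activity names, start times and finish times from the example above.
activities = list(zip(
    [f"A{k}" for k in range(1, 11)],
    [1, 2, 3, 4, 7, 8, 9, 9, 11, 12],    # start times s_i
    [3, 5, 4, 7, 10, 9, 11, 13, 12, 14], # finish times f_i
))

def select_activities(acts):
    """Greedy: sort by finish time, then keep every activity whose start
    time is at or after the finish time of the last selected activity."""
    acts = sorted(acts, key=lambda a: a[2])  # increasing finish time
    selected = [acts[0]]
    for name, s, f in acts[1:]:
        if s >= selected[-1][2]:             # compatible with the last chosen
            selected.append((name, s, f))
    return [name for name, _, _ in selected]

print(select_activities(activities))
# ['A1', 'A3', 'A4', 'A6', 'A7', 'A9', 'A10']
```

Note that skipped activities (A2, A5, A8) are exactly those whose start time falls before the finish time of the previously scheduled activity.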

Prim’s Algorithm-

● Prim’s Algorithm is a famous greedy algorithm.


● It is used for finding the Minimum Spanning Tree (MST) of a given graph.
● To apply Prim’s algorithm, the given graph must be weighted, connected and
undirected.

Prim’s Algorithm Implementation-

The implementation of Prim’s Algorithm is explained in the following steps-

Step-01:

● Choose any vertex to start from.


● Usually, a vertex incident to the edge of least weight is selected.
Step-02:

● Find all the edges that connect the tree to new vertices.
● Find the least weight edge among those edges and include it in the existing tree.
● If including that edge creates a cycle, then reject that edge and look for the next least
weight edge.

Step-03:

● Keep repeating step-02 until all the vertices are included and Minimum Spanning
Tree (MST) is obtained.
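The three steps above can be sketched as follows. This is a minimal illustration using Python's `heapq` as the priority queue and a small hypothetical 4-vertex graph (not one of the practice problems below, whose figures are not reproduced here).

```python
import heapq

def prim_mst(graph, start):
    """Prim's algorithm with a binary heap (O(E log V)).
    graph: {vertex: [(weight, neighbour), ...]} for a weighted,
    connected, undirected graph."""
    visited = {start}                 # Step-01: start from any vertex
    heap = list(graph[start])         # edges leaving the current tree
    heapq.heapify(heap)
    mst_cost = 0
    while heap and len(visited) < len(graph):
        w, v = heapq.heappop(heap)    # Step-02: least-weight crossing edge
        if v in visited:
            continue                  # would create a cycle: reject it
        visited.add(v)                # Step-03: repeat until all included
        mst_cost += w
        for edge in graph[v]:
            if edge[1] not in visited:
                heapq.heappush(heap, edge)
    return mst_cost

# hypothetical example graph
g = {
    "a": [(1, "b"), (4, "c")],
    "b": [(1, "a"), (2, "c"), (7, "d")],
    "c": [(4, "a"), (2, "b"), (3, "d")],
    "d": [(7, "b"), (3, "c")],
}
print(prim_mst(g, "a"))  # 6  (edges a-b, b-c, c-d)
```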

Prim’s Algorithm Time Complexity-

Worst case time complexity of Prim’s Algorithm is-

● O(ElogV) using binary heap


● O(E + VlogV) using Fibonacci heap
Time Complexity Analysis

● If adjacency list is used to represent the graph, then using breadth first search, all
the vertices can be traversed in O(V + E) time.
● We traverse all the vertices of graph using breadth first search and use a min
heap for storing the vertices not yet included in the MST.
● To get the minimum weight edge, we use min heap as a priority queue.
● Min heap operations like extracting minimum element and decreasing key value
takes O(logV) time.

So, overall time complexity

= O(E + V) x O(logV)

= O((E + V)logV)

= O(ElogV)

This time complexity can be improved and reduced to O(E + VlogV) using Fibonacci heap.

PRACTICE PROBLEMS BASED ON PRIM’S ALGORITHM-

Problem-01:

Construct the minimum spanning tree (MST) for the given graph using Prim’s Algorithm-
Solution-

The above discussed steps are followed to find the minimum cost spanning tree using Prim’s
Algorithm-

Step-01:

Step-02:
Step-03:

Step-04:
Step-05:

Step-06:
Since all the vertices have been included in the MST, so we stop.

Now, Cost of Minimum Spanning Tree

= Sum of all edge weights

= 10 + 25 + 22 + 12 + 16 + 14

= 99 units

Problem-02:

Using Prim’s Algorithm, find the cost of minimum spanning tree (MST) of the given graph-
Solution-

The minimum spanning tree obtained by the application of Prim’s Algorithm on the given
graph is as shown below-
Now, Cost of Minimum Spanning Tree

= Sum of all edge weights

= 1 + 4 + 2 + 6 + 3 + 10

= 26 units

Kruskal’s Algorithm-

● Kruskal’s Algorithm is a famous greedy algorithm.


● It is used for finding the Minimum Spanning Tree (MST) of a given graph.
● To apply Kruskal’s algorithm, the given graph must be weighted, connected and
undirected.

Kruskal’s Algorithm Implementation-

The implementation of Kruskal’s Algorithm is explained in the following steps-

Step-01:

● Sort all the edges from low weight to high weight.

Step-02:

● Take the edge with the lowest weight and use it to connect the vertices of graph.
● If adding an edge creates a cycle, then reject that edge and go for the next least
weight edge.

Step-03:

● Keep adding edges until all the vertices are connected and a Minimum Spanning Tree
(MST) is obtained.
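The steps above can be sketched in Python. One standard way to implement the "reject if it creates a cycle" check is a disjoint-set (union-find) structure, assumed here; the example graph is hypothetical.

```python
def kruskal_mst(n, edges):
    """Kruskal's algorithm with union-find.
    n: number of vertices (0..n-1); edges: [(weight, u, v), ...]."""
    parent = list(range(n))

    def find(x):                      # find the component root (with path halving)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    cost = 0
    taken = 0
    for w, u, v in sorted(edges):     # Step-01: sort edges from low to high weight
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                  # Step-02: edge creates a cycle, reject it
        parent[ru] = rv               # union the two components
        cost += w
        taken += 1
        if taken == n - 1:            # Step-03: all vertices connected
            break
    return cost

# hypothetical example graph
edges = [(1, 0, 1), (2, 1, 2), (4, 0, 2), (3, 2, 3), (7, 1, 3)]
print(kruskal_mst(4, edges))  # 6
```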

Thumb Rule to Remember

The above steps may be reduced to the following thumb rule-

● Simply draw all the vertices on the paper.


● Connect these vertices using edges with minimum weights such that no cycle gets
formed.

Kruskal’s Algorithm Time Complexity-

Worst case time complexity of Kruskal’s Algorithm

= O(ElogV) or O(ElogE)
Analysis-

● The edges are maintained as min heap.


● The next edge can be obtained in O(logE) time if graph has E edges.
● Reconstruction of heap takes O(E) time.
● So, Kruskal’s Algorithm takes O(ElogE) time.
● The value of E can be at most O(V²).
● So, O(logE) = O(log V²) = O(2logV) = O(logV), i.e., O(logV) and O(logE) are the same.

Special Case-

● If the edges are already sorted, then there is no need to construct min heap.
● So, deletion from min heap time is saved.
● In this case, time complexity of Kruskal’s Algorithm = O(E + V)

PRACTICE PROBLEMS BASED ON KRUSKAL’S


ALGORITHM-

Problem-01:

Construct the minimum spanning tree (MST) for the given graph using Kruskal’s Algorithm-
Solution-

To construct MST using Kruskal’s Algorithm,

● Simply draw all the vertices on the paper.


● Connect these vertices using edges with minimum weights such that no cycle gets
formed.

Step-01:
Step-02:

Step-03:
Step-04:

Step-05:
Step-06:

Step-07:
Since all the vertices have been connected / included in the MST, so we stop.

Weight of the MST

= Sum of all edge weights

= 10 + 25 + 22 + 12 + 16 + 14

= 99 units

Prim’s and Kruskal’s Algorithms-

We have discussed-

● Prim’s and Kruskal’s Algorithm are the famous greedy algorithms.


● They are used for finding the Minimum Spanning Tree (MST) of a given graph.
● To apply these algorithms, the given graph must be weighted, connected and
undirected.

Some important concepts based on them are-

Concept-01:
If all the edge weights are distinct, then both the algorithms are guaranteed to find the same
MST.

Example-

Consider the following example-

Here, both the algorithms on the above given graph produce the same MST, as shown.

Concept-02:

● If all the edge weights are not distinct, then both the algorithms may not always
produce the same MST.
● However, the cost of both the MSTs would always be the same in both the cases.

Example-

Consider the following example-


Here, both the algorithms on the above given graph produce different MSTs, as shown, but the
cost is the same in both the cases.

Concept-03:

Kruskal’s Algorithm is preferred when-

● The graph is sparse.


● There are fewer edges in the graph, i.e., E = O(V).
● The edges are already sorted or can be sorted in linear time.

Prim’s Algorithm is preferred when-

● The graph is dense.


● There are a large number of edges in the graph, i.e., E = O(V²).

Concept-04:

Difference between Prim’s Algorithm and Kruskal’s Algorithm-

Prim’s Algorithm:

● The tree that we are making or growing always remains connected.
● Grows a solution from a random vertex by adding the next cheapest vertex to the existing tree.
● Faster for dense graphs.

Kruskal’s Algorithm:

● The tree that we are making or growing usually remains disconnected (a forest) until the end.
● Grows a solution from the cheapest edge by adding the next cheapest edge to the existing forest.
● Faster for sparse graphs.

Shortest Path Problem-

● Shortest path problem is a problem of finding the shortest path(s) between vertices of
a given graph.
● Shortest path between two vertices is a path that has the least cost as compared to all
other existing paths.

Shortest Path Algorithms-

Shortest path algorithms are a family of algorithms used for solving the shortest path problem.

Applications-

● Google Maps
● Road Networks
● Logistics Research

Types of Shortest Path Problem-

Various types of shortest path problem are-

1. Single-pair shortest path problem


2. Single-source shortest path problem
3. Single-destination shortest path problem
4. All pairs shortest path problem
Single-Pair Shortest Path Problem-

● It is a shortest path problem where the shortest path between a given pair of vertices
is computed.
● A* Search Algorithm is a famous algorithm used for solving single-pair shortest path
problem.

Single-Source Shortest Path Problem-

● It is a shortest path problem where the shortest path from a given source vertex to all
other remaining vertices is computed.
● Dijkstra’s Algorithm and Bellman Ford Algorithm are the famous algorithms used
for solving single-source shortest path problem.

Single-Destination Shortest Path Problem-

● It is a shortest path problem where the shortest path from all the vertices to a single
destination vertex is computed.
● By reversing the direction of each edge in the graph, this problem reduces to
single-source shortest path problem.
● Dijkstra’s Algorithm is a famous algorithm adapted for solving single-destination
shortest path problem.
All Pairs Shortest Path Problem-

● It is a shortest path problem where the shortest path between every pair of vertices is
computed.
● Floyd-Warshall Algorithm and Johnson’s Algorithm are the famous algorithms used
for solving All pairs shortest path problem.

Dijkstra Algorithm-

● Dijkstra Algorithm is a very famous greedy algorithm.


● It is used for solving the single source shortest path problem.
● It computes the shortest path from one particular source node to all other remaining nodes
of the graph.

Conditions-

It is important to note the following points regarding Dijkstra Algorithm-

● Dijkstra algorithm works only for connected graphs.


● Dijkstra algorithm works only for those graphs that do not contain any negative weight
edge.
● The actual Dijkstra algorithm does not output the shortest paths.
● It only provides the value or cost of the shortest paths.
● By making minor modifications in the actual algorithm, the shortest paths can be easily
obtained.
● Dijkstra algorithm works for directed as well as undirected graphs.

Step-01:

In the first step, two sets are defined-

● One set contains all those vertices which have been included in the shortest path tree.
● In the beginning, this set is empty.
● Other set contains all those vertices which are still left to be included in the shortest
path tree.
● In the beginning, this set contains all the vertices of the given graph.

Step-02:

For each vertex of the given graph, two variables are defined as-

● Π[v] which denotes the predecessor of vertex ‘v’


● d[v] which denotes the shortest path estimate of vertex ‘v’ from the source vertex.

Initially, the value of these variables is set as-

● The value of variable ‘Π’ for each vertex is set to NIL i.e. Π[v] = NIL
● The value of variable ‘d’ for source vertex is set to 0 i.e. d[S] = 0
● The value of variable ‘d’ for remaining vertices is set to ∞ i.e. d[v] = ∞

Step-03:
The following procedure is repeated until all the vertices of the graph are processed-

● Among unprocessed vertices, a vertex with minimum value of variable ‘d’ is chosen.
● Its outgoing edges are relaxed.
● After relaxing the edges for that vertex, the sets created in step-01 are updated.

What is Edge Relaxation?

Consider the edge (a,b) in the following graph-

Here, d[a] and d[b] denote the shortest path estimates for vertices a and b respectively from the
source vertex ‘S’.

Now,

If d[a] + w < d[b]

then d[b] = d[a] + w and Π[b] = a

This is called edge relaxation.
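The repeated pick-minimum-and-relax procedure can be sketched in Python with a binary heap. The example graph here is inferred from the relaxation steps of the worked problem later in this section (the original figure is not reproduced), so treat the edge list as an assumption.

```python
import heapq

def dijkstra(graph, source):
    """Dijkstra's algorithm with a binary heap; returns d[v] for every vertex.
    graph: {u: [(v, w), ...]} with non-negative edge weights."""
    INF = float("inf")
    d = {v: INF for v in graph}   # d[v] = ∞ initially ...
    d[source] = 0                 # ... except d[S] = 0
    pq = [(0, source)]
    while pq:
        du, u = heapq.heappop(pq)     # unprocessed vertex with least d
        if du > d[u]:
            continue                  # stale heap entry, already improved
        for v, w in graph[u]:         # relax every outgoing edge (u, v)
            if d[u] + w < d[v]:
                d[v] = d[u] + w       # edge relaxation
                heapq.heappush(pq, (d[v], v))
    return d

# edges reconstructed from the worked example's relaxation steps
g = {
    "S": [("a", 1), ("b", 5)],
    "a": [("c", 2), ("d", 1), ("b", 2)],
    "b": [("d", 2)],
    "c": [("e", 1)],
    "d": [("e", 2)],
    "e": [],
}
print(dijkstra(g, "S"))  # {'S': 0, 'a': 1, 'b': 3, 'c': 3, 'd': 2, 'e': 4}
```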


Time Complexity Analysis-

Case-01:

This case is valid when-

● The given graph G is represented as an adjacency matrix.


● Priority queue Q is represented as an unordered list.

Here,

● A[i,j] stores the information about edge (i,j).


● Time taken for selecting i with the smallest dist is O(V).
● For each neighbor of i, time taken for updating dist[j] is O(1) and there will be
maximum V neighbors.
● Time taken for each iteration of the loop is O(V) and one vertex is deleted from Q.
● Thus, total time complexity becomes O(V²).

Case-02:

This case is valid when-

● The given graph G is represented as an adjacency list.


● Priority queue Q is represented as a binary heap.

Here,
● With adjacency list representation, all vertices of the graph can be traversed using
BFS in O(V+E) time.
● In min heap, operations like extract-min and decrease-key value takes O(logV) time.
● So, overall time complexity becomes O(E+V) x O(logV) which is O((E + V) x logV) =
O(ElogV)
● This time complexity can be reduced to O(E+VlogV) using Fibonacci heap.

PRACTICE PROBLEM BASED ON DIJKSTRA


ALGORITHM-

Problem-

Using Dijkstra’s Algorithm, find the shortest distance from source vertex ‘S’ to remaining
vertices in the following graph-
Also, write the order in which the vertices are visited.

Solution-

Step-01:

The following two sets are created-

● Unvisited set : {S , a , b , c , d , e}
● Visited set :{}

Step-02:

The two variables Π and d are created for each vertex and initialized as-

● Π[S] = Π[a] = Π[b] = Π[c] = Π[d] = Π[e] = NIL


● d[S] = 0
● d[a] = d[b] = d[c] = d[d] = d[e] = ∞

Step-03:

● Vertex ‘S’ is chosen.


● This is because shortest path estimate for vertex ‘S’ is least.
● The outgoing edges of vertex ‘S’ are relaxed.
Before Edge Relaxation-

Now,

● d[S] + 1 = 0 + 1 = 1 < ∞

∴ d[a] = 1 and Π[a] = S

● d[S] + 5 = 0 + 5 = 5 < ∞

∴ d[b] = 5 and Π[b] = S

After edge relaxation, our shortest path tree is-


Now, the sets are updated as-

● Unvisited set : {a , b , c , d , e}
● Visited set : {S}

Step-04:

● Vertex ‘a’ is chosen.


● This is because shortest path estimate for vertex ‘a’ is least.
● The outgoing edges of vertex ‘a’ are relaxed.

Before Edge Relaxation-


Now,

● d[a] + 2 = 1 + 2 = 3 < ∞

∴ d[c] = 3 and Π[c] = a

● d[a] + 1 = 1 + 1 = 2 < ∞

∴ d[d] = 2 and Π[d] = a

● d[a] + 2 = 1 + 2 = 3 < 5

∴ d[b] = 3 and Π[b] = a

After edge relaxation, our shortest path tree is-


Now, the sets are updated as-

● Unvisited set : {b , c , d , e}
● Visited set : {S , a}

Step-05:

● Vertex ‘d’ is chosen.


● This is because shortest path estimate for vertex ‘d’ is least.
● The outgoing edges of vertex ‘d’ are relaxed.

Before Edge Relaxation-


Now,

● d[d] + 2 = 2 + 2 = 4 < ∞

∴ d[e] = 4 and Π[e] = d

After edge relaxation, our shortest path tree is-


Now, the sets are updated as-

● Unvisited set : {b , c , e}
● Visited set : {S , a , d}

Step-06:

● Vertex ‘b’ is chosen.


● This is because shortest path estimate for vertex ‘b’ is least.
● Vertex ‘c’ could equally be chosen, since both vertices have the same least shortest
path estimate.
● The outgoing edges of vertex ‘b’ are relaxed.

Before Edge Relaxation-

Now,

● d[b] + 2 = 3 + 2 = 5 > 2

∴ No change

After edge relaxation, our shortest path tree remains the same as in Step-05.
Now, the sets are updated as-

● Unvisited set : {c , e}
● Visited set : {S , a , d , b}

Step-07:

● Vertex ‘c’ is chosen.


● This is because shortest path estimate for vertex ‘c’ is least.
● The outgoing edges of vertex ‘c’ are relaxed.

Before Edge Relaxation-

Now,

● d[c] + 1 = 3 + 1 = 4, which is not less than d[e] = 4

∴ No change
After edge relaxation, our shortest path tree remains the same as in Step-05.

Now, the sets are updated as-

● Unvisited set : {e}


● Visited set : {S , a , d , b , c}

Step-08:

● Vertex ‘e’ is chosen.


● This is because shortest path estimate for vertex ‘e’ is least.
● The outgoing edges of vertex ‘e’ are relaxed.
● There are no outgoing edges for vertex ‘e’.
● So, our shortest path tree remains the same as in Step-05.

Now, the sets are updated as-

● Unvisited set : { }
● Visited set : {S , a , d , b , c , e}

Now,

● All vertices of the graph are processed.


● Our final shortest path tree is as shown below.
● It represents the shortest path from source vertex ‘S’ to all other remaining vertices.
Knapsack Problem-

You are given the following-

● A knapsack (kind of shoulder bag) with limited weight capacity.


● A few items, each having some weight and value.

The problem states-

Which items should be placed into the knapsack such that-

● The value or profit obtained by putting the items into the knapsack is maximum.
● And the total weight of the chosen items does not exceed the weight limit of the knapsack.
Knapsack Problem Variants-

Knapsack problem has the following two variants-

1. Fractional Knapsack Problem


2. 0/1 Knapsack Problem

In this article, we will discuss about Fractional Knapsack Problem.

Fractional Knapsack Problem-


In Fractional Knapsack Problem,

● As the name suggests, items are divisible here.


● We can even put the fraction of any item into the knapsack if taking the complete
item is not possible.
● It is solved using Greedy Method.

Fractional Knapsack Problem Using Greedy Method-

Fractional knapsack problem is solved using greedy method in the following steps-

Step-01:

For each item, compute its value / weight ratio.

Step-02:

Arrange all the items in decreasing order of their value / weight ratio.

Step-03:

Start putting the items into the knapsack beginning from the item with the highest ratio.
Put as many items as you can into the knapsack.
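The three steps can be sketched in Python. As an illustration, the item data is taken from the practice problem that follows (weights in kg, values in units).

```python
def fractional_knapsack(items, capacity):
    """Greedy fractional knapsack.
    items: [(name, weight, value), ...]; returns (total value,
    list of (name, fraction taken))."""
    # Step-01 / Step-02: sort by value/weight ratio, highest first
    items = sorted(items, key=lambda it: it[2] / it[1], reverse=True)
    total = 0.0
    taken = []
    for name, w, v in items:
        if capacity <= 0:
            break
        frac = min(1.0, capacity / w)   # Step-03: whole item if it fits,
        taken.append((name, frac))      # otherwise the fitting fraction
        total += frac * v
        capacity -= frac * w
    return total, taken

# data from the practice problem below
items = [("I1", 5, 30), ("I2", 10, 40), ("I3", 15, 45),
         ("I4", 22, 77), ("I5", 25, 90)]
total, taken = fractional_knapsack(items, 60)
print(total)  # ≈ 230.0 units: I1, I2, I5 whole, then 20/22 of I4
```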

Time Complexity-

● The dominant step is the sorting of all items in decreasing order of their value
/ weight ratio.
● If the items are already arranged in the required order, then the selection loop takes O(n)
time.
● The average time complexity of Quick Sort is O(nlogn).
● Therefore, the total time taken including the sort is O(nlogn).

PRACTICE PROBLEM BASED ON FRACTIONAL


KNAPSACK PROBLEM-

Problem-

For the given set of items and knapsack capacity = 60 kg, find the optimal solution for the
fractional knapsack problem making use of greedy approach.

Item Weight Value

1 5 30
2 10 40

3 15 45

4 22 77

5 25 90

OR

Find the optimal solution for the fractional knapsack problem making use of greedy approach.
Consider-

n=5

w = 60 kg

(w1, w2, w3, w4, w5) = (5, 10, 15, 22, 25)

(b1, b2, b3, b4, b5) = (30, 40, 45, 77, 90)

OR

A thief enters a house for robbing it. He can carry a maximal weight of 60 kg into his bag.
There are 5 items in the house with the following weights and values. What items should thief
take if he can even take the fraction of any item with him?

Item Weight Value


1 5 30

2 10 40

3 15 45

4 22 77

5 25 90

Solution-

Step-01:

Compute the value / weight ratio for each item-

Items Weight Value Ratio

1 5 30 6

2 10 40 4

3 15 45 3

4 22 77 3.5

5 25 90 3.6

Step-02:
Sort all the items in decreasing order of their value / weight ratio-

I1 I2 I5 I4 I3

(6) (4) (3.6) (3.5) (3)

Step-03:

Start filling the knapsack by putting the items into it one by one.

Knapsack Weight Left    Items in Knapsack    Cost

60                      Ø                    0

55                      I1                   30

45                      I1, I2               70

20                      I1, I2, I5           160

Now,

● Knapsack weight left to be filled is 20 kg, but item I4 has a weight of 22 kg.


● Since this is the fractional knapsack problem, even a fraction of an item can be taken.
● So, the knapsack will contain the following items-
< I1 , I2 , I5 , (20/22) I4 >

Total cost of the knapsack

= 160 + (20/22) x 77

= 160 + 70

= 230 units

Important Note-

Had the problem been a 0/1 knapsack problem, knapsack would contain the following items-

< I1 , I2 , I5 >

The knapsack’s total cost would be 160 units.

Job Sequencing With Deadlines-

The sequencing of jobs on a single processor with deadline constraints is called Job
Sequencing with Deadlines.

Here-

● You are given a set of jobs.


● Each job has a defined deadline and some profit associated with it.
● The profit of a job is given only when that job is completed within its deadline.
● Only one processor is available for processing all the jobs.
● The processor takes one unit of time to complete a job.

The problem states-

“How can the total profit be maximized if only one job can be completed at a time?”

Approach to Solution-

● A feasible solution would be a subset of jobs where each job of the subset gets
completed within its deadline.
● Value of the feasible solution would be the sum of profit of all the jobs contained in
the subset.
● An optimal solution of the problem would be a feasible solution which gives the
maximum profit.

Greedy Algorithm-

Greedy Algorithm is adopted to determine how the next job is selected for an optimal solution.

The greedy algorithm described below always gives an optimal solution to the job sequencing
problem-

Step-01:
● Sort all the given jobs in decreasing order of their profit.

Step-02:

● Check the value of maximum deadline.


● Draw a Gantt chart where maximum time on Gantt chart is the value of maximum
deadline.

Step-03:

● Pick up the jobs one by one, in the sorted order.


● Place each job on the Gantt chart as far away from 0 (i.e., as late) as possible, while
still ensuring that the job gets completed before its deadline.
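The three steps can be sketched in Python. As an illustration, the job data is taken from the practice problem that follows.

```python
def job_sequencing(jobs):
    """Greedy job sequencing: jobs = [(name, deadline, profit), ...].
    Returns (schedule ordered by time slot, total profit)."""
    # Step-01: sort jobs in decreasing order of profit
    jobs = sorted(jobs, key=lambda j: j[2], reverse=True)
    max_d = max(d for _, d, _ in jobs)      # Step-02: Gantt chart length
    slots = [None] * (max_d + 1)            # slots[1..max_d]
    profit = 0
    for name, d, p in jobs:                 # Step-03: place each job as
        for t in range(min(d, max_d), 0, -1):   # late as its deadline allows
            if slots[t] is None:
                slots[t] = name
                profit += p
                break                       # job placed (or dropped if no slot)
    return slots[1:], profit

# data from the practice problem below
jobs = [("J1", 5, 200), ("J2", 3, 180), ("J3", 3, 190),
        ("J4", 2, 300), ("J5", 4, 120), ("J6", 2, 100)]
print(job_sequencing(jobs))
# (['J2', 'J4', 'J3', 'J5', 'J1'], 990)  -- J6 finds no free slot
```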

PRACTICE PROBLEM BASED ON JOB SEQUENCING


WITH DEADLINES-

Problem-

Given the jobs, their deadlines and associated profits as shown-

Jobs J1 J2 J3 J4 J5 J6
Deadlines 5 3 3 2 4 2

Profits 200 180 190 300 120 100

Answer the following questions-

1. Write the optimal schedule that gives maximum profit.


2. Are all the jobs completed in the optimal schedule?
3. What is the maximum earned profit?

Solution-

Step-01:

Sort all the given jobs in decreasing order of their profit-

Jobs J4 J1 J3 J2 J5 J6

Deadlines 2 5 3 3 4 2

Profits 300 200 190 180 120 100

Step-02:

Value of maximum deadline = 5.


So, draw a Gantt chart with maximum time on Gantt chart = 5 units as shown-

Now,

● We take each job one by one in the order they appear in Step-01.
● We place the job on the Gantt chart as late as possible (as far from 0 as its deadline allows).

Step-03:

● We take job J4.


● Since its deadline is 2, so we place it in the first empty cell before deadline 2 as-

Step-04:

● We take job J1.


● Since its deadline is 5, so we place it in the first empty cell before deadline 5 as-

Step-05:

● We take job J3.


● Since its deadline is 3, so we place it in the first empty cell before deadline 3 as-

Step-06:

● We take job J2.


● Since its deadline is 3, so we place it in the first empty cell before deadline 3.
● Since the second and third cells are already filled, so we place job J2 in the first cell
as-
Step-07:

● Now, we take job J5.


● Since its deadline is 4, so we place it in the first empty cell before deadline 4 as-

Now,

● The only job left is job J6 whose deadline is 2.


● All the slots before deadline 2 are already occupied.
● Thus, job J6 cannot be completed.

Now, the given questions may be answered as-

Part-01:

The optimal schedule is-


J2 , J4 , J3 , J5 , J1

This is the required order in which the jobs must be completed in order to obtain the maximum
profit.

Part-02:

● Not all the jobs are completed in the optimal schedule.


● This is because job J6 could not be completed within its deadline.

Part-03:

Maximum earned profit

= Sum of profit of all the jobs in optimal schedule

= Profit of job J2 + Profit of job J4 + Profit of job J3 + Profit of job J5 + Profit of job J1

= 180 + 300 + 190 + 120 + 200

= 990 units

Huffman Coding-

● Huffman Coding is a famous Greedy Algorithm.


● It is used for the lossless compression of data.
● It uses variable length encoding.
● It assigns variable length code to all the characters.
● The code length of a character depends on how frequently it occurs in the given text.
● The character which occurs most frequently gets the smallest code.
● The character which occurs least frequently gets the largest code.
● It is also known as Huffman Encoding.

Prefix Rule-

● Huffman Coding implements a rule known as the prefix rule.


● This prevents ambiguity while decoding.
● It ensures that the code assigned to any character is not a prefix of the code assigned
to any other character.

Major Steps in Huffman Coding-

There are two major steps in Huffman Coding-

1. Building a Huffman Tree from the input characters.


2. Assigning code to the characters by traversing the Huffman Tree.

Huffman Tree-

The steps involved in the construction of Huffman Tree are as follows-


Step-01:

● Create a leaf node for each character of the text.


● Leaf node of a character contains the occurring frequency of that character.

Step-02:

● Arrange all the nodes in increasing order of their frequency value.

Step-03:

Considering the first two nodes having minimum frequency,

● Create a new internal node.


● The frequency of this new node is the sum of frequency of those two nodes.
● Make the first node as a left child and the other node as a right child of the newly
created node.

Step-04:

● Keep repeating Step-02 and Step-03 until all the nodes form a single tree.
● The tree finally obtained is the desired Huffman Tree.
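The tree-building steps above can be sketched with a min-heap in Python, using the character frequencies of the practice problem that follows. Heap tie-breaking may merge equal-frequency nodes in a different order than a hand-drawn tree, so individual codes can differ, but the code lengths always give the same optimal total bit count.

```python
import heapq

def huffman_codes(freqs):
    """Build a Huffman tree and return {character: code}.
    freqs: {character: frequency}. Each heap node carries the code
    table for the characters in its subtree."""
    # Step-01/02: one leaf per character, ordered by frequency via the heap
    heap = [(f, i, {c: ""}) for i, (c, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)               # unique tie-breaker for equal frequencies
    while len(heap) > 1:
        # Step-03: merge the two minimum-frequency nodes into a new internal node
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {c: "0" + code for c, code in left.items()}   # left edge = 0
        merged.update({c: "1" + code for c, code in right.items()})  # right = 1
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1                  # Step-04: repeat until one tree remains
    return heap[0][2]

# frequencies from the practice problem below
freqs = {"a": 10, "e": 15, "i": 12, "o": 3, "u": 4, "s": 13, "t": 1}
codes = huffman_codes(freqs)
total_bits = sum(freqs[c] * len(codes[c]) for c in freqs)
print(total_bits)  # 146
```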
Time Complexity-

The time complexity analysis of Huffman Coding is as follows-

● extractMin( ) is called 2 x (n-1) times if there are n nodes.


● As extractMin( ) calls minHeapify( ), it takes O(logn) time.

Thus, Overall time complexity of Huffman Coding becomes O(nlogn).

Here, n is the number of unique characters in the given text.

Important Formulas-

The following 2 formulas are important to solve the problems based on Huffman Coding-

Formula-01:

Average code length per character

= ∑ ( frequencyi x code lengthi ) / ∑ ( frequencyi )

Formula-02:
Total number of bits in Huffman encoded message

= Total number of characters in the message x Average code length per character

= ∑ ( frequencyi x code lengthi )

PRACTICE PROBLEM BASED ON HUFFMAN CODING-

Problem-

A file contains the following characters with the frequencies as shown. If Huffman Coding is
used for data compression, determine-

1. Huffman Code for each character


2. Average code length
3. Length of Huffman encoded message (in bits)

Characters Frequencies

a 10

e 15

i 12

o 3

u 4
s 13

t 1

Solution-

First let us construct the Huffman Tree.

Huffman Tree is constructed in the following steps-

Step-01:

Step-02:
Step-03:

Step-04:

Step-05:
Step-06:
Step-07:
Now,

● We assign weight to all the edges of the constructed Huffman Tree.


● Let us assign weight ‘0’ to the left edges and weight ‘1’ to the right edges.

Rule

● If you assign weight ‘0’ to the left edges, then assign weight ‘1’ to the right edges.
● If you assign weight ‘1’ to the left edges, then assign weight ‘0’ to the right edges.
● Any of the above two conventions may be followed.
● But follow the same convention at the time of decoding that is adopted at the time
of encoding.

After assigning weight to all the edges, the modified Huffman Tree is-
Now, let us answer each part of the given problem one by one-

1. Huffman Code For Characters-

To write Huffman Code for any character, traverse the Huffman Tree from root node to the
leaf node of that character.

Following this rule, the Huffman Code for each character is-

● a = 111
● e = 10
● i = 00
● o = 11001
● u = 1101
● s = 01
● t = 11000

From here, we can observe-

● Characters occurring less frequently in the text are assigned the larger code.
● Characters occurring more frequently in the text are assigned the smaller code.

2. Average Code Length-

Using formula-01, we have-

Average code length

= ∑ ( frequency​i​ x code length​i​ ) / ∑ ( frequency​i​ )

= { (10 x 3) + (15 x 2) + (12 x 2) + (3 x 5) + (4 x 4) + (13 x 2) + (1 x 5) } / (10 + 15 + 12 + 3 + 4 + 13 + 1)

= 146 / 58

≈ 2.52

3. Length of Huffman Encoded Message-

Using formula-02, we have-


Total number of bits in Huffman encoded message

= Total number of characters in the message x Average code length per character

= 58 x (146 / 58)

= 146 bits

Note- Multiplying by the rounded average (58 x 2.52 = 146.16) slightly overestimates; the exact
count, ∑ ( frequencyi x code lengthi ), is 146 bits.
