UNIT-IV & V Advanced Data Structures and Algorithms

Anna University notes for Data Structures and Algorithms

Uploaded by Tamilselvam S

UNIT-IV Algorithm Design Techniques

Dynamic Programming:
Dynamic programming is a technique that breaks a problem into sub-problems and
saves their results for future use so that we do not need to compute them again.
Optimizing the subproblems so that the overall solution is optimized is known as the
optimal substructure property. The main use of dynamic programming is to solve
optimization problems. Here, an optimization problem means one in which we try to find
the minimum or the maximum solution of a problem. Dynamic programming guarantees
finding the optimal solution of a problem if such a solution exists.

The definition of dynamic programming says that it is a technique for solving a
complex problem by first breaking it into a collection of simpler subproblems, solving each
subproblem just once, and then storing their solutions to avoid repetitive computations.

Let's understand this approach through an example.

Consider an example of the Fibonacci series. The following series is the Fibonacci
series:

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, …

The numbers in the above series are not randomly calculated. Mathematically, we
could write each of the terms using the below formula:

F(n) = F(n-1) + F(n-2),

With the base values F(0) = 0 and F(1) = 1. To calculate the other numbers, we follow
the above relationship. For example, F(2) is the sum of F(0) and F(1), which is equal to 1.

How can we calculate F(20)?

The F(20) term is calculated using the above recurrence, by expanding a tree of
subproblems: F(20) is calculated as the sum of F(19) and F(18). In the dynamic
programming approach, we divide the problem into similar subproblems; here F(20) is
divided into the similar subproblems F(19) and F(18). If we recall the definition of dynamic
programming, it says that a similar subproblem should not be computed more than once.
Still, in the plain recursive expansion the subproblems are calculated repeatedly: F(18) is
calculated two times, and similarly F(17) is also calculated twice. The technique is quite
useful because it solves similar subproblems, but we need to be careful to store each result
the first time it is computed; otherwise, the recomputation leads to a wastage of resources.

In the above example, if we recompute F(18) in the right subtree instead of reusing its
stored value, it leads to tremendous usage of resources and decreases the overall performance.

The solution to the above problem is to save the computed results in an array. First, we
calculate F(16) and F(17) and save their values in the array. F(18) is calculated by
summing the saved values of F(17) and F(16), and the computed value of F(18) is saved in
the array. The value of F(19) is the sum of F(18) and F(17), whose values are already
saved, and the computed value of F(19) is stored as well. Finally, the value of F(20) is
calculated by adding the stored values of F(19) and F(18), and the final computed value of
F(20) is stored in the array.

How does the dynamic programming approach work?

The following are the steps that the dynamic programming follows:

o It breaks down the complex problem into simpler subproblems.
o It finds the optimal solution to these sub-problems.
o It stores the results of the subproblems. The process of storing the results
of subproblems is known as memoization.
o It reuses the stored results so that the same sub-problem is not calculated more than once.
o Finally, it calculates the result of the complex problem.

The above five steps are the basic steps of dynamic programming. Dynamic
programming is applicable to problems having properties such as:

Those problems that have overlapping subproblems and optimal substructure. Here,
optimal substructure means that the solution of the optimization problem can be obtained by
simply combining the optimal solutions of all the subproblems.

In the case of dynamic programming, the space complexity increases because we are
storing the intermediate results, but the time complexity decreases.

Approaches of dynamic programming

There are two approaches to dynamic programming:


o Top-down approach
o Bottom-up approach

Top-down approach

The top-down approach follows the memoization technique, while the bottom-up approach
follows the tabulation method. Here memoization equals recursion plus caching:
recursion means calling the function itself, while caching means storing the
intermediate results.

Advantages

o It is very easy to understand and implement.


o It solves the subproblems only when it is required.
o It is easy to debug.

Disadvantages

It uses recursion, which occupies memory in the call stack. Sometimes, when the
recursion is too deep, a stack overflow condition will occur.

The extra memory it occupies can degrade the overall performance.

Let's understand dynamic programming through an example.

int fib(int n)
{
    if (n < 0)
        return -1;   // error: invalid input
    if (n == 0)
        return 0;
    if (n == 1)
        return 1;
    return fib(n - 1) + fib(n - 2);
}

In the above code, we have used the recursive approach to find the Fibonacci numbers.
When the value of 'n' increases, the function calls and computations also increase. In this
case, the time complexity grows exponentially: it becomes O(2^n).

One solution to this problem is to use the dynamic programming approach. Rather than
generating the recursive tree again and again, we can reuse the previously calculated value. If
we use the dynamic programming approach, then the time complexity would be O(n).

When we apply the dynamic programming approach in the implementation of the Fibonacci
series, then the code would look like:
static int count = 0;    // counts how many times fib() actually computes
int memo[MAXN];          // every entry initialized to -1 ("not yet computed")

int fib(int n)
{
    if (n < 0)
        return -1;       // error: invalid input
    if (n == 0)
        return 0;
    if (n == 1)
        return 1;
    if (memo[n] != -1)   // already computed: reuse the stored value
        return memo[n];
    count++;
    int sum = fib(n - 1) + fib(n - 2);
    memo[n] = sum;
    return sum;
}

In the above code, we have used the memoization technique: we store the results in
an array to reuse the values. This is also known as the top-down approach, in which we move
from the top and break the problem into sub-problems.

Bottom-Up approach

The bottom-up approach is the other technique that can be used to implement
dynamic programming. It uses the tabulation technique. It solves the same kinds of
problems but removes the recursion; without recursion there is no stack overflow issue and
no overhead of recursive function calls. In the tabulation technique, we solve the problems
and store the results in a table.

There are two ways of applying dynamic programming:

o Top-Down
o Bottom-Up

The bottom-up approach avoids recursion, thus saving memory space. A bottom-up
algorithm starts from the beginning, whereas a recursive algorithm starts from the end and
works backward. In the bottom-up approach, we start from the base cases and build up to
the final answer. As we know, the base cases in the Fibonacci series are 0 and 1; since the
bottom-up approach starts from the base cases, we will start from 0 and 1.

Key points

o We solve all the smaller sub-problems that will be needed for the larger sub-
problems, then move to the larger problems using those results.
o We use a for loop to iterate over the sub-problems.
o The bottom-up approach is also known as the tabulation or table-filling method.

Let's understand through an example.

Suppose we have an array that has the values 0 and 1 at positions a[0] and a[1],
respectively.

Since the bottom-up approach starts from the lower values, the values at a[0] and a[1] are
added to find the value of a[2], which becomes 1.

The value of a[3] is calculated by adding a[1] and a[2], and it becomes 2.

The value of a[4] is calculated by adding a[2] and a[3], and it becomes 3.

The value of a[5] is calculated by adding the values of a[4] and a[3], and it becomes 5.

The code for implementing the Fibonacci series using the bottom-up approach is given
below:

int fib(int n)
{
    int A[n + 1];
    A[0] = 0; A[1] = 1;
    for (int i = 2; i <= n; i++)
    {
        A[i] = A[i - 1] + A[i - 2];
    }
    return A[n];
}

In the above code, the base cases are 0 and 1, and then we have used a for loop to find the
other values of the Fibonacci series.

Let's understand this through a step-by-step trace.

Initially, the first two values are 0 and 1. When i = 2, the values 0 and 1 are added, giving 1.
When i = 3, the values 1 and 1 are added, giving 2. When i = 4, the values 2 and 1 are added,
giving 3. When i = 5, the values 3 and 2 are added, giving 5.

In this case, we are starting from the bottom and reaching the top.

Algorithm of Matrix Chain Multiplication:


MATRIX-CHAIN-ORDER (p)

1. n ← length[p] - 1
2. for i ← 1 to n
3.     do m[i, i] ← 0
4. for l ← 2 to n        // l is the chain length
5.     do for i ← 1 to n - l + 1
6.         do j ← i + l - 1
7.         m[i, j] ← ∞
8.         for k ← i to j - 1
9.             do q ← m[i, k] + m[k + 1, j] + p[i-1] x p[k] x p[j]
10.            if q < m[i, j]
11.                then m[i, j] ← q
12.                     s[i, j] ← k
13. return m and s.

We will use table s to construct an optimal solution.

Step 1: Constructing an Optimal Solution:


PRINT-OPTIMAL-PARENS (s, i, j)
1. if i = j
2.     then print "A"i
3.     else print "("
4.         PRINT-OPTIMAL-PARENS (s, i, s[i, j])
5.         PRINT-OPTIMAL-PARENS (s, s[i, j] + 1, j)
6.         print ")"

Analysis: There are three nested loops. Each loop executes a maximum of n times.

1. l, length, O (n) iterations.


2. i, start, O (n) iterations.
3. k, split point, O (n) iterations

Body of loop constant complexity

Total Complexity is: O(n³)

Algorithm with Explained Example

Question: P = {7, 1, 5, 4, 2}

Solution: Here, P is the array of a dimension of matrices.

So here we will have 4 matrices:

A1: 7 x 1, A2: 1 x 5, A3: 5 x 4, A4: 4 x 2


i.e.
First Matrix A1 have dimension 7 x 1
Second Matrix A2 have dimension 1 x 5
Third Matrix A3 have dimension 5 x 4
Fourth Matrix A4 have dimension 4 x 2

Let's say,
From P = {7, 1, 5, 4, 2} (given),
p0 = 7, p1 = 1, p2 = 5, p3 = 4, p4 = 2.
Length of array P = number of elements in P
∴ length (p) = 5
Now follow the steps of the algorithm in sequence.
According to Step 1 of Algorithm Matrix-Chain-Order
Step 1:

n ← length [p]-1
Where n is the total number of elements
And length [p] = 5
∴n=5-1=4
n=4
Now we construct two tables m and s.
Table m has dimension [1.....n, 1.......n]
Table s has dimension [1.....n-1, 2.......n]

Now, according to step 2 of Algorithm

1. for i ← 1 to n
2. this means: for i ← 1 to 4 (because n =4)
3. for i=1
4. m [i, i]=0
5. m [1, 1]=0
6. Similarly for i = 2, 3, 4
7. m [2, 2] = m [3,3] = m [4,4] = 0
8. i.e. fill all the diagonal entries "0" in the table m
9. Now,
10. l ← 2 to n
11. l ← 2 to 4 (because n =4 )

Case 1:

1. When l = 2

for (i ← 1 to n - l + 1)
i ← 1 to 4 - 2 + 1
i ← 1 to 3

When i = 1
do j ← i + l - 1
j←1+2-1
j←2
i.e. j = 2
Now, m [i, j] ← ∞
i.e. m [1,2] ← ∞
Put ∞ in m [1, 2] table
for k ← i to j-1
k ← 1 to 2 - 1
k ← 1 to 1
k=1
Now q ← m [i, k] + m [k + 1, j] + p[i-1] x p[k] x p[j]
for l = 2
i=1
j =2
k=1
q ← m [1,1] + m [2,2] + p0 x p1 x p2
and m [1,1] = 0, m [2,2] = 0
∴ q ← 0 + 0 + 7 x 1 x 5
q ← 35
We have m [i, j] = m [1, 2] = ∞
Comparing q with m [1, 2]
q < m [i, j]
i.e. 35 < m [1, 2]
35 < ∞
True
then, m [1, 2 ] ← 35 (∴ m [i,j] ← q)
s [1, 2] ← k
and the value of k = 1
s [1,2 ] ← 1
Insert "1" at dimension s [1, 2] in table s. And 35 at m [1, 2]

2. l remains 2

L=2
i ← 1 to n - l + 1
i ← 1 to 4 - 2 + 1
i ← 1 to 3
for i = 1 done before
Now value of i becomes 2
i=2
j← i+l-1
j← 2+2-1
j← 3
j=3
m [i , j] ← ∞
i.e. m [2,3] ← ∞
Initially insert ∞ at m [2, 3]
Now, for k ← i to j - 1
k ← 2 to 3 - 1
k ← 2 to 2
i.e. k =2
Now, q ← m [i, k] + m [k + 1, j] + p[i-1] x p[k] x p[j]
For l =2
i=2
j=3
k=2
q ← m [2, 2] + m [3, 3] + p1 x p2 x p3
q ← 0 + 0 + 1 x 5 x 4
q ← 20
Compare q with m [i ,j ]
If q < m [i,j]
i.e. 20 < m [2, 3]
20 < ∞
True
Then m [i,j ] ← q
m [2, 3 ] ← 20
and s [2, 3] ← k
and k = 2
s [2,3] ← 2

3. Now i become 3

i=3
l=2
j←i+l-1
j←3+2-1
j←4
j=4
Now, m [i, j ] ← ∞
m [3,4 ] ← ∞
Insert ∞ at m [3, 4]
for k ← i to j - 1
k ← 3 to 4 - 1
k ← 3 to 3
i.e. k = 3
Now, q ← m [i, k] + m [k + 1, j] + p[i-1] x p[k] x p[j]
i=3
l=2
j=4
k=3
q ← m [3, 3] + m [4, 4] + p2 x p3 x p4
q ← 0 + 0 + 5 x 4 x 2
q ← 40
Compare q with m [i, j]
If q < m [i, j]
40 < m [3, 4]
40 < ∞
True
Then, m [i,j] ← q
m [3,4] ← 40
and s [3,4] ← k
s [3,4] ← 3

Case 2: l becomes 3

L=3
for i = 1 to n - l + 1
i = 1 to 4 - 3 + 1
i = 1 to 2
When i = 1
j←i+l-1
j←1+3-1
j←3
j=3
Now, m [i,j] ← ∞
m [1, 3] ← ∞
for k ← i to j - 1
k ← 1 to 3 - 1
k ← 1 to 2

Now we compute q for both k = 1 and k = 2; the minimum of the two will be placed in
m [i, j], with the winning k in s [i, j].

For k = 1: q ← m [1,1] + m [2,3] + p0 x p1 x p3 = 0 + 20 + 7 x 1 x 4 = 48
For k = 2: q ← m [1,2] + m [3,3] + p0 x p2 x p3 = 35 + 0 + 7 x 5 x 4 = 175

The value of q is minimum for k = 1, and 48 < ∞,
∴ m [1, 3] ← 48
and s [1, 3] ← 1
Now i become 2
i=2
l=3
then j ← i + l -1
j← 2+3-1
j←4
j=4
so m [i,j] ← ∞
m [2,4] ← ∞
Insert initially ∞ at m [2, 4]
for k ← i to j-1
k ← 2 to 4 - 1
k ← 2 to 3

Here also we find the minimum value of q for the two values k = 2 and k = 3:

For k = 2: q ← m [2,2] + m [3,4] + p1 x p2 x p4 = 0 + 40 + 1 x 5 x 2 = 50
For k = 3: q ← m [2,3] + m [4,4] + p1 x p3 x p4 = 20 + 0 + 1 x 4 x 2 = 28

The minimum is 28 (for k = 3), and 28 < ∞,
so m [2, 4] ← 28
and s [2, 4] ← 3
i.e. in table s at s [2,4] insert 3, and at m [2,4] insert 28.

Case 3: l becomes 4

L=4
For i ← 1 to n-l + 1
i ← 1 to 4 - 4 + 1
i←1
i=1
do j ← i + l - 1
j←1+4-1
j←4
j=4
Now m [i,j] ← ∞
m [1,4] ← ∞
for k ← i to j -1
k ← 1 to 4 - 1
k ← 1 to 3
When k = 1
q ← m [i, k] + m [k + 1, j] + p[i-1] x p[k] x p[j]
q ← m [1,1] + m [2,4] + p0 x p1 x p4
q ← 0 + 28 + 7 x 1 x 2
q ← 42
Compare q and m [i, j]
m [i,j] was ∞
i.e. m [1,4]
if q < m [1,4]
42< ∞
True
Then m [i,j] ← q
m [1,4] ← 42
and s [1,4] ← 1, since k = 1
When k = 2
L = 4, i=1, j = 4
q ← m [i, k] + m [k + 1, j] + p[i-1] x p[k] x p[j]
q ← m [1, 2] + m [3,4] + p0 x p2 x p4
q ← 35 + 40 + 7 x 5 x 2
q ← 145
Compare q and m [i,j]
Now m [i, j]
i.e. m [1,4] contains 42.
So we check if q < m [1, 4].
But 145 is not less than m [1, 4],
i.e. 145 is not less than 42.
So no change occurs.
When k = 3
l=4
i=1
j=4
q ← m [i, k] + m [k + 1, j] + p[i-1] x p[k] x p[j]
q ← m [1, 3] + m [4,4] + p0 x p3 x p4
q ← 48 + 0 + 7 x 4 x 2
q ← 104
Again q is not less than m [i, j],
i.e. 104 is not less than m [1, 4],
104 is not less than 42.

So no change occurs. So the value of m [1, 4] remains 42. And value of s [1, 4] = 1

Now we will make use of only s table to get an optimal solution.

Example of Matrix Chain Multiplication:


Example: We are given the sequence {4, 10, 3, 12, 20, 7}. The matrices have sizes 4 x
10, 10 x 3, 3 x 12, 12 x 20, and 20 x 7. We need to compute M [i, j] for 1 ≤ i ≤ j ≤ 5. We
know M [i, i] = 0 for all i.
Let us proceed with working away from the diagonal. We compute the optimal solution for
the product of 2 matrices.

Here p0 to p5 are the dimension entries, and Mi is the matrix of size p(i-1) x p(i).

On the basis of the sequence, we use the formula
M [i, j] = min over i ≤ k < j of ( M [i, k] + M [k+1, j] + p(i-1) x p(k) x p(j) ).

In dynamic programming, every diagonal entry M [i, i] is initialized to '0', and the table is
filled diagonal by diagonal.

We have to work out all the combinations, but only the minimum-cost combination is taken
into consideration.

Calculation of Product of 2 matrices:


1. m (1, 2) = M1 x M2
   dimensions: (4 x 10)(10 x 3)
   cost = 4 x 10 x 3 = 120

2. m (2, 3) = M2 x M3
   dimensions: (10 x 3)(3 x 12)
   cost = 10 x 3 x 12 = 360

3. m (3, 4) = M3 x M4
   dimensions: (3 x 12)(12 x 20)
   cost = 3 x 12 x 20 = 720

4. m (4, 5) = M4 x M5
   dimensions: (12 x 20)(20 x 7)
   cost = 12 x 20 x 7 = 1680

o We initialize the diagonal elements (where i = j) with '0'.
o After that, the second diagonal is worked out and we get all the values corresponding to it.

Now the third diagonal will be solved in the same way.

Now product of 3 matrices:

M [1, 3] = M1 M2 M3
1. There are two ways to parenthesize this multiplication: (M1 x M2) x M3 and
M1 x (M2 x M3).
2. After solving both cases we choose the one with the minimum cost.

(M1 x M2) x M3 = 120 + 4 x 3 x 12 = 264
M1 x (M2 x M3) = 360 + 4 x 10 x 12 = 840

M [1, 3] = 264

Comparing both outputs, 264 is the minimum, so we insert 264 in the table, and
(M1 x M2) x M3 is the combination chosen.

M [2, 4] = M2 M3 M4
1. There are two ways to parenthesize this multiplication: (M2 x M3) x M4 and
M2 x (M3 x M4).
2. After solving both cases we choose the one with the minimum cost.

(M2 x M3) x M4 = 360 + 10 x 12 x 20 = 2760
M2 x (M3 x M4) = 720 + 10 x 3 x 20 = 1320

M [2, 4] = 1320

Comparing both outputs, 1320 is the minimum, so we insert 1320 in the table, and
M2 x (M3 x M4) is the combination chosen.

M [3, 5] = M3 M4 M5
1. There are two ways to parenthesize this multiplication: (M3 x M4) x M5 and
M3 x (M4 x M5).
2. After solving both cases we choose the one with the minimum cost.

(M3 x M4) x M5 = 720 + 3 x 20 x 7 = 1140
M3 x (M4 x M5) = 1680 + 3 x 12 x 7 = 1932

M [3, 5] = 1140

Comparing both outputs, 1140 is the minimum, so we insert 1140 in the table, and
(M3 x M4) x M5 is the combination chosen.

Now Product of 4 matrices:

M [1, 4] = M1 M2 M3 M4

There are three ways to split this multiplication:

1. (M1 x M2 x M3) x M4 = 264 + 4 x 12 x 20 = 1224
2. M1 x (M2 x M3 x M4) = 1320 + 4 x 10 x 20 = 2120
3. (M1 x M2) x (M3 x M4) = 120 + 720 + 4 x 3 x 20 = 1080

After solving these cases we choose the one with the minimum cost.

M [1, 4] = 1080

Comparing the different cases, 1080 is the minimum, so we insert 1080 in the table, and
(M1 x M2) x (M3 x M4) is the combination chosen.

M [2, 5] = M2 M3 M4 M5

There are three ways to split this multiplication:

1. (M2 x M3 x M4) x M5 = 1320 + 10 x 20 x 7 = 2720
2. M2 x (M3 x M4 x M5) = 1140 + 10 x 3 x 7 = 1350
3. (M2 x M3) x (M4 x M5) = 360 + 1680 + 10 x 12 x 7 = 2880

After solving these cases we choose the one with the minimum cost.

M [2, 5] = 1350

Comparing the different cases, 1350 is the minimum, so we insert 1350 in the table, and
M2 x (M3 x M4 x M5) is the combination chosen.

Now Product of 5 matrices:

M [1, 5] = M1 M2 M3 M4 M5

There are four ways to split this multiplication:

1. M1 x (M2 x M3 x M4 x M5) = 1350 + 4 x 10 x 7 = 1630
2. (M1 x M2) x (M3 x M4 x M5) = 120 + 1140 + 4 x 3 x 7 = 1344
3. (M1 x M2 x M3) x (M4 x M5) = 264 + 1680 + 4 x 12 x 7 = 2280
4. (M1 x M2 x M3 x M4) x M5 = 1080 + 4 x 20 x 7 = 1640

After solving these cases we choose the one with the minimum cost.

M [1, 5] = 1344

Comparing the different cases, 1344 is the minimum, so we insert 1344 in the table, and
(M1 x M2) x (M3 x M4 x M5) is the combination chosen.
The final output is M [1, 5] = 1344, obtained with the parenthesization
(M1 x M2) x (M3 x M4 x M5).

Step 3: Computing Optimal Costs: let us assume that matrix Ai has dimension p(i-1) x p(i) for
i = 1, 2, 3, ..., n. The input is a sequence (p0, p1, ..., pn) where length [p] = n + 1. The procedure
uses an auxiliary table m [1...n, 1...n] for storing the m [i, j] costs and an auxiliary table s [1...n,
1...n] that records which index k achieved the optimal cost in computing m [i, j].

The algorithm first computes m [i, i] ← 0 for i = 1, 2, 3, ..., n, the minimum costs for chains
of length 1.

Dynamic Programming (DP)


Dynamic Programming (DP) is defined as a technique that solves certain types
of problems in polynomial time. Dynamic programming solutions are faster than the
exponential brute-force method, and their correctness can be easily proved.

Characteristics of Dynamic Programming Algorithm:


 In general, dynamic programming (DP) is one of the most powerful techniques for
solving a certain class of problems.
 There is an elegant way to formulate the approach and a very simple thinking process,
and the coding part is very easy.
 Essentially, it is a simple idea: after solving a problem with a given input, save the
result as a reference for future use, so you won’t have to re-solve it. Briefly,
‘Remember your Past’ :).
 It is a big hint for DP if the given problem can be broken up into smaller sub-problems,
and these smaller subproblems can be divided into still smaller ones, and in this
process, you see some overlapping subproblems.
 Additionally, the optimal solutions to the subproblems contribute to the optimal
solution of the given problem (referred to as the Optimal Substructure Property).
 The solutions to the subproblems are stored in a table or array (memoization) or in a
bottom-up manner (tabulation) to avoid redundant computation.
 The solution to the problem can be constructed from the solutions to the subproblems.
 Dynamic programming can be implemented using a recursive algorithm, where the
solutions to subproblems are found recursively, or using an iterative algorithm, where
the solutions are found by working through the subproblems in a specific order.
Dynamic programming works on the following principles:
 Characterize structure of optimal solution, i.e. build a mathematical model of the
solution.
 Recursively define the value of the optimal solution.
 Using bottom-up approach, compute the value of the optimal solution for each possible
subproblems.
 Construct optimal solution for the original problem using information computed in the
previous step.
Applications:
Dynamic programming is used to solve optimization problems. It is used to solve many
real-life problems such as,
(i) Making change (coin change) problem
(ii) Knapsack problem
(iii) Optimal binary search tree
What is the difference between a Dynamic programming algorithm and recursion?
 In dynamic programming, problems are solved by breaking them down into smaller
ones to solve the larger ones, while recursion is when a function is called and executed
by itself. While dynamic programming can function without making use of recursion
techniques, since the purpose of dynamic programming is to optimize and accelerate
the process, programmers usually make use of recursion techniques to accelerate and
turn the process efficiently.
 A function that can execute a specific task by calling itself is called a recursive
function. In order to perform and accomplish its work, this function calls itself when
it has to be executed.
 Using dynamic programming, you can break a problem into smaller parts, called
subproblems, to solve it. Dynamic programming involves solving the problem for the
first time, then using memoization to store the solutions.
 Therefore, the main difference between the two techniques is their intended
use: recursion is a way of structuring a function's execution, whereas dynamic
programming is an optimization technique used to solve problems.
 Recursive functions recognize when they are needed, execute themselves, and then stop
working. When the function identifies the moment it is needed, it calls itself and is
executed; this is called a recursive case. The function must stop once the
task is completed; this is known as the base case.
 By establishing states, dynamic programming recognizes the problem and divides it
into sub-problems in order to solve the whole problem. After solving these sub-problems,
the programmer must establish a mathematical relationship between them.
Last but not least, these solutions and results are stored so they can be
accessed in the future without having to solve the whole problem again.
Techniques to solve Dynamic Programming Problems:
1. Top-Down(Memoization):
Break down the given problem in order to begin solving it. If you see that the problem has
already been solved, return the saved answer. If it hasn’t been solved, solve it and save it.
This is usually easy to think of and very intuitive. This is referred to as Memoization.
2. Bottom-Up(Dynamic Programming):
Analyze the problem and see in what order the subproblems are solved, and work your
way up from the trivial subproblem to the given problem. This process ensures that the
subproblems are solved before the main problem. This is referred to as Dynamic
Programming.
Types of the approach of dynamic programming algorithm

Tabulation(Dynamic Programming) vs Memoization:


There are two different ways to store the values so that the values of a sub-problem can be
reused. Here, we will discuss two patterns of solving dynamic programming (DP) problems:
 Tabulation: Bottom Up
 Memoization: Top Down
Before getting to the definitions of the above two terms consider the following statements:
 Version 1: I will study the theory of DP from GeeksforGeeks, then I will practice some
problems on classic DP and hence I will master DP.
 Version 2: To master DP, I would have to practice DP problems; to practice those
problems, I would first have to study some theory of DP from GeeksforGeeks.
Both versions say the same thing; the difference lies only in the way of conveying the
message, and that’s exactly what Bottom-Up and Top-Down DP do. Version 1 can be
related to Bottom-Up DP and Version 2 to Top-Down DP.
Tabulation vs Memoization, aspect by aspect:

 State: In tabulation, the state transition relation is difficult to think of; in
memoization, the state transition relation is easy to think of.
 Code: Tabulation code gets complicated when a lot of conditions are required;
memoization code is easy and less complicated.
 Speed: Tabulation is fast, as we directly access previous states from the table;
memoization is slow due to a lot of recursive calls and return statements.
 Subproblem solving: If all subproblems must be solved at least once, a bottom-up
dynamic programming algorithm usually outperforms a top-down memoized algorithm
by a constant factor. If some subproblems in the subproblem space need not be solved
at all, the memoized solution has the advantage of solving only those subproblems
that are definitely required.
 Table entries: In the tabulated version, starting from the first entry, all entries are
filled one by one. Unlike the tabulated version, not all entries of the lookup table are
necessarily filled in the memoized version; the table is filled on demand.
 Approach: Generally, tabulation (dynamic programming) is an iterative approach,
while memoization is a recursive approach.

How to solve a Dynamic Programming Problem?


To solve a problem using dynamic programming, we need to check two necessary conditions:
 Overlapping Subproblems: When the solutions to the same subproblems are needed
repeatedly while solving the actual problem, the problem is said to have the overlapping
subproblems property.

N-th Fibonacci Series as Overlapping Subproblems

 Optimal Substructure Property: If the optimal solution of the given problem can be
obtained by using optimal solutions of its subproblems then the problem is said to have
Optimal Substructure Property.
Steps to solve a Dynamic programming problem:
1. Identify if it is a Dynamic programming problem.
2. Decide a state expression with the Least parameters.
3. Formulate state and transition relationships.
4. Do tabulation (or memoization).
1) How to classify a problem as a Dynamic Programming algorithm Problem?
 Typically, all the problems that require maximizing or minimizing certain quantities
or counting problems that say to count the arrangements under certain conditions or
certain probability problems can be solved by using Dynamic Programming.
 All dynamic programming problems satisfy the overlapping subproblems property,
and most of the classic dynamic programming problems also satisfy the optimal
substructure property. Once we observe these properties in a given problem, we can
be sure that it can be solved using Dynamic Programming.
2) Deciding the state:
Problems with dynamic programming are mostly concerned with the state and its
transition. The most fundamental phase must be carried out with extreme care because the
state transition depends on the state definition you select.
State:
A state is a collection of characteristics that can be used to specifically describe a given
position or standing in a given challenge. To minimise state space, this set of parameters
has to be as compact as feasible.
3) Formulating a relation among the states:
The hardest part of a Dynamic Programming challenge is this step, which calls for a lot of
intuition, observation, and training.
Example:
Given 3 numbers {1, 3, 5}, the task is to tell the total number of ways we can form a
number N using the sum of the given three numbers. (allowing repetitions and different
arrangements).
The total number of ways to form 6 is: 8
1+1+1+1+1+1
1+1+1+3
1+1+3+1
1+3+1+1
3+1+1+1
3+3
1+5
5+1
Following are the steps to solve the problem:
 We choose a state for the given problem.
 N will be used as the determining factor for the state because it can be used to identify
any subproblem.
 The DP state will resemble state(N), where the state(N) is the total number of
arrangements required to create N using the elements 1, 3, and 5. Identify the
relationship of the transition between any two states.
 We must now calculate the state (N).
3.1) How to Compute the state?
As we can only use 1, 3, or 5 to form a given number N, let us assume that we know
the results for:
state (n = 1), state (n = 2), state (n = 3) ……… state (n = 6)
Now, we wish to know the result of the state (n = 7). See, we can only add 1, 3, and 5.
Now we can get a sum total of 7 in the following 3 ways:
1) Adding 1 to all possible combinations of state (n = 6)
Eg: [ (1+1+1+1+1+1) + 1]
[ (1+1+1+3) + 1]
[ (1+1+3+1) + 1]
[ (1+3+1+1) + 1]
[ (3+1+1+1) + 1]
[ (3+3) + 1]
[ (1+5) + 1]
[ (5+1) + 1]
2) Adding 3 to all possible combinations of state (n = 4);
[(1+1+1+1) + 3]
[(1+3) + 3]
[(3+1) + 3]
3) Adding 5 to all possible combinations of state(n = 2)
[ (1+1) + 5]
(Note how it is sufficient to add only on the right side – all the add-from-the-left-side cases
are covered, either in the same state or in another; e.g. [1 + (1+1+1+3)] is not needed in
state (n = 6) because it is covered by state (n = 4) as [(1+1+1+1) + 3].)
Now, think carefully and satisfy yourself that the above three cases are covering all
possible ways to form a sum total of 7;
Therefore, we can say that result for
state(7) = state (6) + state (4) + state (2)
OR
state(7) = state (7-1) + state (7-3) + state (7-5)
In general,
state(n) = state(n-1) + state(n-3) + state(n-5)
Below is the implementation for the above approach:
// Returns the number of arrangements to
// form 'n'
int solve(int n)
{
    // base cases
    if (n < 0)
        return 0;
    if (n == 0)
        return 1;

    return solve(n - 1) + solve(n - 3) + solve(n - 5);
}
Time Complexity: O(3^n). At every stage we need to take three decisions, and the height
of the recursion tree is of the order of n.
Auxiliary Space: O(n). The extra space is used by the recursion call stack.
The above code is exponential because it calculates the same state again and again. So,
we just need to add memoization.
4) Adding memoization or tabulation for the state
The simplest portion of a solution based on dynamic programming is this. Simply storing
the state solution will allow us to access it from memory the next time that state is needed.
Adding memoization to the above code:
// Lookup table; every entry must be initialized
// to -1 beforehand, e.g. memset(dp, -1, sizeof(dp));
int dp[MAXN];

// This function returns the number of
// arrangements to form 'n'
int solve(int n)
{
    // base cases
    if (n < 0)
        return 0;
    if (n == 0)
        return 1;

    // Checking if already calculated
    if (dp[n] != -1)
        return dp[n];

    // Storing the result and returning
    return dp[n] = solve(n - 1) + solve(n - 3)
                   + solve(n - 5);
}

Time Complexity: O(n), as each state from 0 to n is computed only once; every repeated
call returns the previously stored result instead of recomputing it.
Auxiliary Space: O(n), The extra space is used due to the recursion call stack.
How to solve Dynamic Programming problems through Example?
Problem: Let’s find the Fibonacci sequence up to the nth term. A Fibonacci series is the
sequence of numbers in which each number is the sum of the two preceding ones. For
example, 0, 1, 1, 2, 3, and so on. Here, each number is the sum of the two preceding
numbers.
Naive Approach: The basic way to find the nth Fibonacci number is to use recursion.
Below is the implementation for the above approach:

// C++ code for the above approach:


#include <iostream>
using namespace std;

// Function to find nth fibonacci number
int fib(int n)
{
    if (n <= 1) {
        return n;
    }
    int x = fib(n - 1);
    int y = fib(n - 2);

    return x + y;
}
// Drivers code
int main()
{
int n = 5;

// Function Call
cout << fib(n);
return 0;
}

Output
5

Complexity Analysis:
Time Complexity: O(2^n)
 Here, for every n, we are required to make a recursive call to fib(n – 1) and fib(n – 2).
For fib(n – 1), we will again make recursive calls to fib(n – 2) and fib(n – 3).
Similarly, for fib(n – 2), recursive calls are made on fib(n – 3) and fib(n – 4) until we
reach the base case.
 During each recursive call, we perform constant work k (adding previous outputs to
obtain the current output). At level i of the recursion tree we perform about 2^i·k work
(for i = 0, 1, 2, …), and since about n levels are needed to reach the base case, the final
level performs about 2^(n-1)·k work. Total work can be calculated as:
k(2^0 + 2^1 + … + 2^(n-1)) = k(2^n – 1), which is O(2^n).
 If we draw the recursion tree of the Fibonacci recursion, we find that the maximum
height of the tree is n; hence the space complexity of the Fibonacci recursion is O(n).
Efficient approach: Since exponential complexity is very poor, we need to optimize the
method with an efficient technique: memoization.
Let’s look at the example below for finding the 5th Fibonacci number.

Representation of 5th Fibonacci number

Observations:
 The entire program repeats recursive calls. As in the above figure, for calculating
fib(4) we need the value of fib(3) (first recursive call over fib(3)), and for calculating
fib(5) we again need the value of fib(3) (a second, similar recursive call over fib(3)).
 Both of these recursive calls are highlighted in the figure by the outlined circle.
 Similarly, there are many others for which we are repeating the recursive calls.
 Recursion generally involves repeated recursive calls, which increases the program’s
time complexity.
 By storing the output of previously encountered values (preferably in arrays, as these
can be traversed and extracted most efficiently), we can overcome this problem. The
next time we make a recursive call over these values, we will use their already stored
outputs instead of calculating them all over again.
 In this way, we can improve the performance of our code. Memoization is the process
of storing each recursive call’s output for later use, preventing the code from
calculating it again.
Way to memoize: To achieve this in our example we will simply take an answer array
initialized to -1. As we make a recursive call, we will first check if the value stored in the
answer array corresponding to that position is -1. The value -1 indicates that we haven’t
calculated it yet and have to recursively compute it. The output must be stored in the
answer array so that, next time, if the same value is encountered, it can be directly used
from the answer array.
Now in this process of memoization, considering the above Fibonacci numbers example, it
can be observed that the total number of unique calls will be at most (n + 1) only.
Below is the implementation for the above approach:

// C++ code for the above approach:


#include <iostream>
using namespace std;

// Helper Function
int fibo_helper(int n, int* ans)
{
    // Base case
    if (n <= 1) {
        return n;
    }

    // To check if output already exists
    if (ans[n] != -1) {
        return ans[n];
    }

    // Calculate output
    int x = fibo_helper(n - 1, ans);
    int y = fibo_helper(n - 2, ans);

    // Saving the output for future use
    ans[n] = x + y;

    // Returning the final output
    return ans[n];
}

int fibo(int n)
{
    int* ans = new int[n + 1];

    // Initializing with -1
    for (int i = 0; i <= n; i++) {
        ans[i] = -1;
    }

    // Returning the computed result (and freeing the array)
    int result = fibo_helper(n, ans);
    delete[] ans;
    return result;
}

// Drivers code
int main()
{
int n = 5;

// Function Call
cout << fibo(n);
return 0;
}

Output
5

Complexity analysis:
 Time complexity: O(n)
 Auxiliary Space: O(n)
Optimized approach: Follow a bottom-up approach to reach the desired index. This
approach of converting recursion into iteration is known as tabulation, the bottom-up form
of Dynamic Programming (DP).
Observations:
 Finally, what we do is recursively call each response index field and calculate its value
using previously saved outputs.
 Recursive calls terminate via the base case, which means we are already aware of the
answers which should be stored in the base case indexes.
 In the case of Fibonacci numbers, these indices are 0 and 1, as fib(0) = 0 and fib(1) =
1. So we can directly assign these two values into our answer array and then use them
to calculate fib(2), which is fib(1) + fib(0), and so on for each subsequent index.
 This can easily be done iteratively by running a loop from i = (2 to n). Finally, we get
our answer at the 5th index of the array because we already know that the ith index
contains the answer to the ith value.
 Simply, we first find out how the current value depends on previous values and then
use those previous values to calculate the new value. Then we look for the values that
do not depend on any others, i.e., the independent (base case) values, since these are
the smallest problems whose answers we already know.
Below is the implementation for the above approach:
// C++ code for the above approach:
#include <iostream>
using namespace std;

// Function for calculating the nth
// Fibonacci number
int fibo(int n)
{
    // Base case (also avoids out-of-bounds writes when n < 2)
    if (n <= 1)
        return n;

    int* ans = new int[n + 1];

    // Storing the independent values in the
    // answer array
    ans[0] = 0;
    ans[1] = 1;

    // Using the bottom-up approach
    for (int i = 2; i <= n; i++) {
        ans[i] = ans[i - 1] + ans[i - 2];
    }

    // Returning the final index (and freeing the array)
    int result = ans[n];
    delete[] ans;
    return result;
}

// Drivers code
int main()
{
int n = 5;

// Function Call
cout << fibo(n);
return 0;
}

Output
5

Complexity analysis:
 Time complexity: O(n)
 Auxiliary Space: O(n)
Optimization of above method
 In the above code we can see that the current state of any Fibonacci number depends
only on the previous two numbers.
 Using this observation, we can conclude that we do not need to store the whole table
of size n; instead, we can store only the previous two values.
 In this way we can optimize the space complexity of the above code from O(n) to O(1).

// C++ code for the above approach:


#include <iostream>
using namespace std;

// Function for calculating the nth Fibonacci number
int fibo(int n)
{
    // Base case
    if (n <= 1)
        return n;

    int prevPrev, prev, curr;

    // Storing the independent values
    prevPrev = 0;
    prev = 1;
    curr = 1;

    // Using the bottom-up approach
    for (int i = 2; i <= n; i++) {
        curr = prev + prevPrev;
        prevPrev = prev;
        prev = curr;
    }

    // Returning the final answer
    return curr;
}

// Drivers code
int main()
{
int n = 5;

// Function Call
cout << fibo(n);
return 0;
}

Output
5

Greedy approach vs Dynamic programming


Feature: Feasibility
Greedy method: we make whatever choice seems best at the moment, in the hope that it
will lead to a globally optimal solution.
Dynamic programming: we make a decision at each step considering the current problem
and the solutions to previously solved subproblems, to calculate an optimal solution.

Feature: Optimality
Greedy method: sometimes there is no guarantee of getting an optimal solution.
Dynamic programming: guaranteed to generate an optimal solution, as it generally
considers all possible cases and then chooses the best.

Feature: Recursion
Greedy method: follows the problem-solving heuristic of making the locally optimal
choice at each stage.
Dynamic programming: an algorithmic technique usually based on a recurrence formula
that uses some previously calculated states.

Feature: Memoization
Greedy method: more efficient in terms of memory, as it never looks back or revises
previous choices.
Dynamic programming: requires a table for memoization, which increases its memory
complexity.

Feature: Time complexity
Greedy method: generally faster; for example, Dijkstra's shortest path algorithm takes
O(E log V + V log V) time.
Dynamic programming: generally slower; for example, the Bellman-Ford algorithm takes
O(VE) time.

Feature: Fashion
Greedy method: computes its solution by making its choices in a serial, forward fashion,
never looking back or revising previous choices.
Dynamic programming: computes its solution bottom-up or top-down by synthesizing it
from smaller optimal sub-solutions.

Feature: Example
Greedy method: fractional knapsack.
Dynamic programming: 0/1 knapsack problem.
Longest Common Sequence (LCS)
A subsequence of a given sequence is just the given sequence with some elements left out.

Given two sequences X and Y, we say that the sequence Z is a common sequence of X and Y
if Z is a subsequence of both X and Y.

In the longest common subsequence problem, we are given two sequences X = (x1, x2, ..., xm)
and Y = (y1, y2, ..., yn) and wish to find a maximum-length common subsequence of X and Y.
LCS Problem can be solved using dynamic programming.

Characteristics of Longest Common Sequence

In a brute-force approach, we find all the subsequences of X and check each subsequence to
see if it is also a subsequence of Y; this approach requires exponential time, making it
impractical for long sequences.

Given a sequence X = (x1, x2, ..., xm), we define the ith prefix of X, for i = 0, 1, ..., m, as
Xi = (x1, x2, ..., xi). For example: if X = (A, B, C, B, C, A, B, C) then X4 = (A, B, C, B)

Optimal Substructure of an LCS: Let X = (x1, x2, ..., xm) and Y = (y1, y2, ..., yn) be the
sequences and let Z = (z1, z2, ..., zk) be any LCS of X and Y.

o If xm = yn, then zk = xm = yn and Zk-1 is an LCS of Xm-1 and Yn-1
o If xm ≠ yn, then zk ≠ xm implies that Z is an LCS of Xm-1 and Y.
o If xm ≠ yn, then zk ≠ yn implies that Z is an LCS of X and Yn-1

Step 2: Recursive Solution: LCS has the overlapping-subproblems property because, to find
an LCS of X and Y, we may need to find the LCS of Xm-1 and Yn-1. If xm ≠ yn, then we must
solve two subproblems: finding an LCS of Xm-1 and Y, and finding an LCS of X and Yn-1.
Whichever of these LCSs is longer is an LCS of X and Y. Each of these subproblems in turn
contains the subproblem of finding the LCS of Xm-1 and Yn-1.

Let c[i, j] be the length of an LCS of the prefixes Xi and Yj. If either i = 0 or j = 0, one of
the sequences has length 0, so the LCS has length 0. The optimal substructure of the LCS
problem gives the recurrence formula:

c[i, j] = 0                          if i = 0 or j = 0
c[i, j] = c[i-1, j-1] + 1            if i, j > 0 and xi = yj
c[i, j] = max(c[i-1, j], c[i, j-1])  if i, j > 0 and xi ≠ yj

Step 3: Computing the length of an LCS: let the two sequences X = (x1, x2, ..., xm) and Y =
(y1, y2, ..., yn) be the inputs. The procedure stores the c[i, j] values in the table
c[0..m, 0..n]. A table b[1..m, 1..n] is also maintained, which helps us construct an optimal
solution. c[m, n] contains the length of an LCS of X and Y.
The longest common subsequence problem is to find the longest subsequence that
exists in both of the given strings.

But before we understand the problem, let us understand what the term subsequence is −

Let us consider a sequence S = <s1, s2, s3, s4, …, sn>. Another sequence Z = <z1, z2,
z3, …, zm> over S is called a subsequence of S if and only if it can be derived from S by
deleting some elements. In simple words, a subsequence is obtained by picking elements of
the sequence in order; the chosen elements need not be consecutive.

Common Subsequence

Suppose, X and Y are two sequences over a finite set of elements. We can say that Z is a
common subsequence of X and Y, if Z is a subsequence of both X and Y.

Longest Common Subsequence

If a set of sequences are given, the longest common subsequence problem is to find a
common subsequence of all the sequences that is of maximal length.

Naive Method

Let X be a sequence of length m and Y a sequence of length n. Check for every subsequence
of X whether it is a subsequence of Y, and return the longest common subsequence found.

There are 2^m subsequences of X. Testing whether a sequence is a subsequence
of Y takes O(n) time. Thus, the naive algorithm would take O(n·2^m) time.

Longest Common Subsequence Algorithm

Let X = <x1, x2, x3, ..., xm> and Y = <y1, y2, y3, ..., yn> be the sequences. To compute the
length of an LCS, the following algorithm is used.

Step 1 − Construct a table of size (n + 1) × (m + 1), where n = size of sequence X and m =
size of sequence Y. The rows in the table represent the elements of sequence X and the
columns represent the elements of sequence Y.

Step 2 − The zeroth row and column must be filled with zeroes. The remaining values are
filled in based on different cases, by maintaining a counter value.

 Case 1 − If the counter encounters common element in both X and Y sequences,


increment the counter by 1.
 Case 2 − If the counter does not encounter common elements in X and Y sequences at
T[i, j], find the maximum value between T[i-1, j] and T[i, j-1] to fill it in T[i, j].

Step 3 − Once the table is filled, backtrack from the last value in the table. Backtracking here
is done by tracing the path where the counter incremented first.

Step 4 − The longest common subsequence is obtained by noting the elements in the traced path.

Pseudocode
In this procedure, table C[m, n] is computed in row major order and another table B[m,n] is
computed to construct optimal solution.

Algorithm: LCS-Length-Table-Formulation (X, Y)


m := length(X)
n := length(Y)
for i = 0 to m do
C[i, 0] := 0
for j = 0 to n do
C[0, j] := 0
for i = 1 to m do
for j = 1 to n do
if xi = yj
C[i, j] := C[i - 1, j - 1] + 1
B[i, j] := ‘D’
else
if C[i - 1, j] ≥ C[i, j - 1]
C[i, j] := C[i - 1, j]
B[i, j] := ‘U’
else
C[i, j] := C[i, j - 1]
B[i, j] := ‘L’
return C and B
Algorithm: Print-LCS (B, X, i, j)
if i = 0 or j = 0
return
if B[i, j] = ‘D’
Print-LCS(B, X, i-1, j-1)
Print(xi)
else if B[i, j] = ‘U’
Print-LCS(B, X, i-1, j)
else
Print-LCS(B, X, i, j-1)

This algorithm will print the longest common subsequence of X and Y.

Analysis

To populate the table, the outer for loop iterates m times and the inner for loop
iterates n times. Hence, the complexity of the algorithm is O(mn), where m and n are the
lengths of the two strings.

Example
In this example, we have two sequences X = BDCB and Y = BACDB, for which we must
find the longest common subsequence.

Following the algorithm, we need to compute the two tables C and B.

Given m = length of X = 4, n = length of Y = 5

X = BDCB, Y = BACDB

Constructing the LCS Tables

In the table below, the zeroth row and column are filled with zeroes. The remaining values
are filled by incrementing on matches and choosing the maximum values according to the
algorithm.

      Ø  B  A  C  D  B
  Ø   0  0  0  0  0  0
  B   0  1  1  1  1  1
  D   0  1  1  1  2  2
  C   0  1  1  2  2  2
  B   0  1  1  2  2  3

Once the values are filled, the path is traced back from the last value in the table, at T[4, 5].
From the traced path, the longest common subsequence is found by choosing the cells
where the counter was first incremented.

In this example, the final count is 3, so the counter was incremented at 3 places, i.e., at B,
C, and B. Therefore, the longest common subsequence of sequences X and Y is BCB.

Implementation

Following is the final implementation to find the Longest Common Subsequence using
Dynamic Programming Approach −

#include <stdio.h>
#include <string.h>

int max(int a, int b);

int lcs(char* X, char* Y, int m, int n) {
    int L[m + 1][n + 1];
    int i, j, index;

    // Build the length table bottom-up
    for (i = 0; i <= m; i++) {
        for (j = 0; j <= n; j++) {
            if (i == 0 || j == 0)
                L[i][j] = 0;
            else if (X[i - 1] == Y[j - 1])
                L[i][j] = L[i - 1][j - 1] + 1;
            else
                L[i][j] = max(L[i - 1][j], L[i][j - 1]);
        }
    }

    // Backtrack from L[m][n] to recover one LCS
    index = L[m][n];
    char LCS[index + 1];
    LCS[index] = '\0';
    i = m, j = n;
    while (i > 0 && j > 0) {
        if (X[i - 1] == Y[j - 1]) {
            LCS[index - 1] = X[i - 1];
            i--;
            j--;
            index--;
        } else if (L[i - 1][j] > L[i][j - 1])
            i--;
        else
            j--;
    }
    printf("LCS: %s\n", LCS);
    return L[m][n];
}

int max(int a, int b) {
    return (a > b) ? a : b;
}

int main() {
    char X[] = "ABSDHS";
    char Y[] = "ABDHSP";
    int m = strlen(X);
    int n = strlen(Y);
    printf("Length of LCS is %d\n", lcs(X, Y, m, n));
    return 0;
}
Output
LCS: ABDHS
Length of LCS is 5

Applications

The longest common subsequence problem is a classic computer science problem, the basis
of data comparison programs such as the diff-utility, and has applications in bioinformatics.
It is also widely used by revision control systems, such as SVN and Git, for reconciling
multiple changes made to a revision-controlled collection of files.
Greedy Algorithm:
The greedy method is one of the strategies like Divide and conquer used to solve the
problems. This method is used for solving optimization problems. An optimization problem
is a problem that demands either maximum or minimum results. Let's understand through
some terms.

The Greedy method is the simplest and most straightforward approach. It is not an
algorithm but a technique. The main feature of this approach is that decisions are taken on
the basis of the currently available information. Whatever information is currently present,
the decision is made without worrying about the effect of that decision in the future.

This technique is basically used to determine a feasible solution that may or may not
be optimal. A feasible solution is a solution that satisfies the given criteria; if more than one
solution satisfies the criteria, all of them are considered feasible. The optimal solution is the
best and most favorable solution among all the feasible solutions.

Characteristics of Greedy method

The following are the characteristics of a greedy method:

o To construct the solution in an optimal way, this algorithm creates two sets where one
set contains all the chosen items, and another set contains the rejected items.
o A Greedy algorithm makes good local choices in the hope that the solution should be
either feasible or optimal.

Components of Greedy Algorithm

The components that can be used in the greedy algorithm are:

o Candidate set: A solution that is created from the set is known as a candidate set.
o Selection function: This function is used to choose the candidate or subset which can
be added in the solution.
o Feasibility function: A function that is used to determine whether the candidate or
subset can be used to contribute to the solution or not.
o Objective function: A function is used to assign the value to the solution or the partial
solution.
o Solution function: This function is used to indicate whether a complete solution has
been reached or not.

Applications of Greedy Algorithm


o It is used in finding the shortest path.
o It is used to find the minimum spanning tree using the prim's algorithm or the
Kruskal's algorithm.
o It is used in a job sequencing with a deadline.
o This algorithm is also used to solve the fractional knapsack problem.

Pseudo code of Greedy Algorithm

1. Algorithm Greedy (a, n)
2. {
3. solution := 0;
4. for i = 1 to n do
5. {
6. x := select(a);
7. if feasible(solution, x)
8. {
9. solution := union(solution, x);
10. }
11. }
12. return solution;
13. }

The above is the greedy algorithm. Initially, the solution is assigned the value zero.
We pass the array and the number of elements to the greedy algorithm. Inside the for loop,
we select the elements one by one and check whether the solution is feasible or not. If the
solution is feasible, then we perform the union.

Let's understand through an example.

Suppose there is a problem 'P'. I want to travel from A to B shown as below:

P:A→B

The problem is that we have to make this journey from A to B. There are various ways to
go from A to B: on foot, or by car, bike, train, aeroplane, etc. There is a constraint that we
must complete the journey within 12 hrs. Only by going by train or aeroplane can we cover
this distance within 12 hrs. There are many solutions to this problem, but only two of them
satisfy the constraint.

Suppose we also have to cover the journey at the minimum cost. Since we want to
minimize the cost, this is known as a minimization problem. Till now, we have two feasible
solutions, i.e., one by train and one by air. Since travelling by train leads to the minimum
cost, it is the optimal solution. An optimal solution is also a feasible solution, but it is the
one providing the best result; here, the solution with the minimum cost. There would be
only one optimal solution.
The problem that requires either minimum or maximum result then that problem is known as
an optimization problem. Greedy method is one of the strategies used for solving the
optimization problems.

Disadvantages of using Greedy algorithm

Greedy algorithm makes decisions based on the information available at each phase without
considering the broader problem. So, there might be a possibility that the greedy solution
does not give the best solution for every problem.

It follows the locally optimal choice at each stage with the intent of finding the global
optimum. Let's understand through an example.

Consider the graph which is given below:

We have to travel from the source to the destination at the minimum cost. Since we
have three feasible solutions having cost paths as 10, 20, and 5. 5 is the minimum cost path
so it is the optimal solution. This is the local optimum, and in this way, we find the local
optimum at each stage in order to calculate the global optimal solution.

Greedy Algorithm is defined as a method for solving optimization problems by


taking decisions that result in the most evident and immediate benefit irrespective of the
final outcome. It works for cases where minimization or maximization leads to the
required solution.
Characteristics of Greedy algorithm
For a problem to be solved using the Greedy approach, it must follow a few major
characteristics:
 There is an ordered list of resources(profit, cost, value, etc.)
 Maximum of all the resources(max profit, max value, etc.) are taken.
 For example, in the fractional knapsack problem, the maximum value/weight is taken
first according to available capacity.
Use of the greedy algorithm:-
The greedy algorithm is a method used in optimization problems where the goal is to make
the locally optimal choice at each stage with the hope of finding a global optimum. It is
called “greedy” because it tries to find the best solution by making the best choice at each
step, without considering future steps or the consequences of the current decision.
Some common use cases for the greedy algorithm include:
 Scheduling and Resource Allocation: The greedy algorithm can be used to schedule
jobs or allocate resources in an efficient manner.
 Minimum Spanning Trees: The greedy algorithm can be used to find the minimum
spanning tree of a graph, which is the subgraph that connects all vertices with the
minimum total edge weight.
 Coin Change Problem: The greedy algorithm can be used to make change for a given
amount with the minimum number of coins, by always choosing the coin with the
highest value that is less than the remaining amount to be changed.
 Huffman Coding: The greedy algorithm can be used to generate a prefix-free code for
data compression, by constructing a binary tree in a way that the frequency of each
character is taken into consideration.
It’s important to note that not all optimization problems can be solved by a greedy
algorithm, and there are cases where the greedy approach can lead to suboptimal solutions.
However, in many cases, the greedy algorithm provides a good approximation to the
optimal solution and is a useful tool for solving optimization problems quickly and
efficiently.
All greedy algorithms follow a basic structure:
 Declare an empty result = 0.
 We make a greedy choice to select, If the choice is feasible add it to the final result.
 return the result.
Why choose Greedy Approach?
The greedy approach has a few tradeoffs, which may make it suitable for optimization.
One prominent reason is to achieve the most feasible solution immediately. In the activity
selection problem (Explained below), if more activities can be done before finishing the
current activity, these activities can be performed within the same time. Another reason is
to divide a problem recursively based on a condition, with no need to combine all the
solutions. In the activity selection problem, the “recursive division” step is achieved by
scanning a list of items only once and considering certain activities.
Greedy Algorithm Example:
Some Famous problems that exhibit Optimal substructure property and can be solved
using Greedy approach are –
1) Job sequencing Problem:
Greedily choose the jobs with maximum profit first, by sorting the jobs in decreasing order
of their profit. This would help to maximize the total profit as choosing the job with
maximum profit for every time slot will eventually maximize the total profit
2) Prim’s algorithm to find Minimum Spanning Tree:
It starts with an empty spanning tree. The idea is to maintain two sets of vertices. The first
set contains the vertices already included in the MST, the other set contains the vertices
not yet included. At every step, it considers all the edges that connect the two sets and
picks the minimum weight edge from these edges. After picking the edge, it moves the
other endpoint of the edge to the set containing MST.
How does the Greedy Algorithm works?
Choosing to apply the greedy method without a thorough examination of the problem can
be difficult and occasionally even result in failure. In some cases, taking the locally best
choice may lead to losing the globally optimal solution.
For example:
 One such example where the Greedy Approach fails is to find the Maximum weighted
path of nodes in the given graph.
Graph with weighted vertices

 In the above graph starting from the root node 10 if we greedily select the next node to
obtain the most weighted path the next selected node will be 5 that will take the total
sum to 15 and the path will end as there is no child of 5 but the path 10 -> 5 is not the
maximum weight path.

Greedy Approach fails

 In order to find the most weighted path all possible path sum must be computed and
their path sum must be compared to get the desired result, it is visible that the most
weighted path in the above graph is 10 -> 1 -> 30 that gives the path sum 41.
Correct Approach

 In such cases Greedy approach wouldn’t work instead complete paths from root to leaf
node has to be considered to get the correct answer i.e. the most weighted path, This
can be achieved by recursively checking all the paths and calculating their weight.
Thus, to use a greedy algorithm, the problem must exhibit the greedy-choice
property: a locally optimal choice must lead to a globally optimal solution.
Greedy Algorithm Vs Dynamic Programming
Greedy algorithm and Dynamic programming are two of the most widely used algorithm
paradigms for solving complex programming problems, While Greedy approach works for
problems where local optimal choice leads to global optimal solution Dynamic
Programming works for problems having overlapping subproblems structure where
answer to a subproblem is needed for solving several other subproblems. Detailed
differences are given in the table below:
Feature: Feasibility
Greedy Algorithm: we make whatever choice seems best at the moment, in the hope that it
will lead to a globally optimal solution.
Dynamic Programming: we make a decision at each step considering the current problem
and the solutions to previously solved subproblems, to calculate an optimal solution.

Feature: Optimality
Greedy Algorithm: sometimes there is no guarantee of getting an optimal solution.
Dynamic Programming: guaranteed to generate an optimal solution, as it generally
considers all possible cases and then chooses the best.

Feature: Recursion
Greedy Algorithm: follows the problem-solving heuristic of making the locally optimal
choice at each stage.
Dynamic Programming: an algorithmic technique usually based on a recurrence formula
that uses some previously calculated states.

Feature: Memoization
Greedy Algorithm: more efficient in terms of memory, as it never looks back or revises
previous choices.
Dynamic Programming: requires a table for memoization, which increases its memory
complexity.

Feature: Time complexity
Greedy Algorithm: generally faster; for example, Dijkstra's shortest path algorithm takes
O(E log V + V log V) time.
Dynamic Programming: generally slower; for example, the Bellman-Ford algorithm takes
O(VE) time.

Feature: Fashion
Greedy Algorithm: computes its solution by making its choices in a serial, forward
fashion, never looking back or revising previous choices.
Dynamic Programming: computes its solution bottom-up or top-down by synthesizing it
from smaller optimal sub-solutions.

Feature: Example
Greedy Algorithm: fractional knapsack.
Dynamic Programming: 0/1 knapsack problem.

Applications of Greedy Algorithms:


 Finding an optimal solution (Activity selection, Fractional Knapsack, Job
Sequencing, Huffman Coding).
 Finding close to the optimal solution for NP-Hard problems like TSP.
 Greedy algorithm is used to select the jobs that will be completed before their
respective deadlines and maximizes the profit.
 Greedy algorithms are used to cluster data points together based on certain criteria,
such as distance or similarity.
 The problem is broken down into smaller subproblems that are solved independently,
but many of these subproblems are identical or overlapping.
Advantages of the Greedy Approach:
 The greedy approach is easy to implement.
 Typically have less time complexity.
 Greedy algorithms can be used for optimization purposes or finding close to
optimization in case of Hard problems.
 The greedy approach can be very efficient, as it does not require exploring all possible
solutions to the problem.
 The greedy approach can provide a clear and easy-to-understand solution to a problem,
as it follows a step-by-step process.
 The solutions to subproblems can be stored in a table, which can be reused for similar
problems.
Disadvantages of the Greedy Approach:
 The local optimal solution may not always be globally optimal.
 Lack of proof of optimality.
 The greedy approach is only applicable to problems that have the property of greedy-
choice property meaning not all problems can be solved using this approach.
 The greedy approach is not easily adaptable to changing problem conditions.
Here are some important points to keep in mind when working with greedy
algorithms:
1. Greedy algorithms make the locally optimal choice at each step, without considering
the consequences of that choice on future steps.
2. Greedy algorithms can be used to solve optimization problems that can be divided into
smaller subproblems.
3. Greedy algorithms may not always find the optimal solution. It is important to prove
the correctness of a greedy algorithm and to understand its limitations.
4. Greedy algorithms can be applied in many contexts, including scheduling, graph
theory, and dynamic programming.
5. When designing a greedy algorithm, it is important to identify the optimal substructure
and the greedy choice property.
6. The time complexity of a greedy algorithm depends on the specific problem and the
implementation of the algorithm.
7. Greedy algorithms can sometimes be used as a heuristic approach to solve problems
when the optimal solution is difficult to find in practice.
In some cases, a greedy algorithm may provide a solution that is close to the optimal
solution, but not necessarily the exact optimal solution. These solutions are known as
approximate solutions.
Elements of the Greedy Strategy:
Optimal Substructure:
An optimal solution to the problem contains within it optimal solutions to sub-problems. If A is an optimal set of activities and activity 1 (the greedy choice) is removed, then A' = A - {1} is an optimal solution to the subproblem S' = { i ∈ S : si ≥ f1 }, and A' can be found by running the greedy algorithm again on S'.
When do you use DP versus a greedy approach? Which should be faster?
The 0 - 1 knapsack problem:
A thief has a knapsack that holds at most W pounds. Item i : ( vi, wi ) ( v = value, w = weight ). The thief must choose items to maximize the value stolen while still fitting into the knapsack. Each item must be taken whole or left behind ( 0 - 1 ).
Fractional knapsack problem:
The thief may take fractions of items as well as whole items.
Both the 0 - 1 and fractional problems have the optimal substructure property:
Fractional: vi / wi is the value per pound. Clearly you take as much of the item with the
greatest value per pound. This continues until you fill the knapsack. Optimal (Greedy)
algorithm takes O ( n lg n ), as we must sort on vi / wi = di.
Consider the same strategy for the 0 - 1 problem:
W = 50 lbs. (maximum knapsack capacity)
w1 = 10 v1 = 60 d1.= 6
w2 = 20 v2 = 100 d2.= 5
w3 = 30 v3 = 120 d3 = 4
where d is the value density (value per pound).
Greedy approach: take all of item 1 and all of item 2: v1 + v2 = 160. The optimal solution is to take all of items 2 and 3: v2 + v3 = 220. The other candidate, all of items 1 and 3, gives v1 + v3 = 180. All three loads are under 50 lbs, so the greedy (density-first) choice is not optimal for the 0 - 1 problem.
When solving the 0 - 1 knapsack problem, empty space lowers the effective d of the
load. Thus each time an item is chosen for inclusion we must consider both
i included
i excluded
These are clearly overlapping sub-problems for different i's and so best solved by DP!

An Activity Selection Problem:

The activity selection problem is a mathematical optimization problem. Our first illustration is the problem of scheduling a resource among several competing activities. We find that a greedy algorithm provides a well-designed and simple method for selecting a maximum-size set of mutually compatible activities.

Suppose S = {1, 2, ..., n} is the set of n proposed activities. The activities share a resource which can be used by only one activity at a time, e.g., a tennis court or a lecture hall. Each activity i has a start time si and a finish time fi, where si ≤ fi. If selected, activity i takes place during the half-open time interval [si, fi). Activities i and j are compatible if the intervals [si, fi) and [sj, fj) do not overlap (i.e., i and j are compatible if si ≥ fj or sj ≥ fi). The activity-selection problem is to choose a maximum-size set of mutually compatible activities.

Algorithm Of Greedy- Activity Selector:

GREEDY- ACTIVITY SELECTOR (s, f)


1. n ← length [s]
2. A ← {1}
3. j ← 1
4. for i ← 2 to n
5. do if si ≥ fj
6. then A ← A ∪ {i}
7. j ← i
8. return A

Example: Given 10 activities along with their start and end time as

S = (A1 A2 A3 A4 A5 A6 A7 A8 A9 A10)
Si = (1,2,3,4,7,8,9,9,11,12)

fi = (3,5,4,7,10,9,11,13,12,14)

Compute a schedule where the greatest number of activities takes place.

Solution: The solution to the above Activity scheduling problem using a greedy strategy is
illustrated below:

Arranging the activities in increasing order of end time

Now, schedule A1

Next schedule A3 as A1 and A3 are non-interfering.

Next skip A2 as it is interfering.


Next, schedule A4 as A1 A3 and A4 are non-interfering, then next, schedule A6 as
A1 A3 A4 and A6 are non-interfering.

Skip A5 as it is interfering.

Next, schedule A7 as A1 A3 A4 A6 and A7 are non-interfering.

Next, schedule A9 as A1 A3 A4 A6 A7 and A9 are non-interfering.

Skip A8 as it is interfering.

Next, schedule A10 as A1 A3 A4 A6 A7 A9 and A10 are non-interfering.

Thus the final activity schedule is: A1, A3, A4, A6, A7, A9, A10 (7 activities).

Huffman coding :

Huffman coding is a lossless data compression algorithm. The idea is to assign


variable-length codes to input characters, lengths of the assigned codes are based on the
frequencies of corresponding characters
The variable-length codes assigned to input characters are Prefix Codes, means the
codes (bit sequences) are assigned in such a way that the code assigned to one character is
not the prefix of code assigned to any other character.
This is how Huffman Coding makes sure that there is no ambiguity when
decoding the generated bitstream.
Let us understand prefix codes with a counter example. Let there be four characters a,
b, c and d, and their corresponding variable length codes be 00, 01, 0 and 1. This coding
leads to ambiguity because code assigned to c is the prefix of codes assigned to a and b. If
the compressed bit stream is 0001, the de-compressed output may be “cccd” or “ccb” or
“acd” or “ab”.
There are mainly two major parts in Huffman Coding
1. Build a Huffman Tree from input characters.
2. Traverse the Huffman Tree and assign codes to characters.

Algorithm:

The method which is used to construct optimal prefix code is called Huffman coding.
This algorithm builds a tree in a bottom-up manner. We can denote this tree by T.
Let |c| be the number of leaves; |c| - 1 merge operations are required to build the tree. Let Q be a priority queue, which can be implemented as a binary min-heap.
Algorithm Huffman (c)
{
    n = |c|
    Q = c
    for i = 1 to n - 1
    do
    {
        temp = GetNode()
        left[temp] = Get_min(Q)
        right[temp] = Get_min(Q)
        a = left[temp]
        b = right[temp]
        f[temp] = f[a] + f[b]
        Insert(Q, temp)
    }
    return Get_min(Q)
}
Steps to build Huffman Tree
Input is an array of unique characters along with their frequency of occurrences and output
is Huffman Tree.
1. Create a leaf node for each unique character and build a min heap of all leaf nodes
(Min Heap is used as a priority queue. The value of frequency field is used to compare
two nodes in min heap. Initially, the least frequent character is at root)
2. Extract two nodes with the minimum frequency from the min heap.

3. Create a new internal node with a frequency equal to the sum of the two nodes
frequencies. Make the first extracted node as its left child and the other extracted node
as its right child. Add this node to the min heap.
4. Repeat steps#2 and #3 until the heap contains only one node. The remaining node is the
root node and the tree is complete.
Let us understand the algorithm with an example:
character Frequency
a 5
b 9
c 12
d 13
e 16
f 45
Step 1. Build a min heap that contains 6 nodes where each node represents root of a tree
with single node.
Step 2 Extract two minimum frequency nodes from min heap. Add a new internal node
with frequency 5 + 9 = 14.

Illustration of step 2

Now min heap contains 5 nodes where 4 nodes are roots of trees with single element each,
and one heap node is root of tree with 3 elements
character Frequency
c 12
d 13
Internal Node 14
e 16
f 45
Step 3: Extract two minimum frequency nodes from heap. Add a new internal node with
frequency 12 + 13 = 25

Illustration of step 3

Now min heap contains 4 nodes where 2 nodes are roots of trees with single element each,
and two heap nodes are root of tree with more than one nodes
character Frequency
Internal Node 14
e 16
Internal Node 25
f 45
Step 4: Extract two minimum frequency nodes. Add a new internal node with frequency
14 + 16 = 30

Illustration of step 4

Now min heap contains 3 nodes.


character Frequency
Internal Node 25
Internal Node 30
f 45
Step 5: Extract two minimum frequency nodes. Add a new internal node with frequency
25 + 30 = 55

Illustration of step 5

Now min heap contains 2 nodes.


character Frequency
f 45
Internal Node 55
Step 6: Extract two minimum frequency nodes. Add a new internal node with frequency
45 + 55 = 100

Illustration of step 6

Now min heap contains only one node.


character Frequency
Internal Node 100
Since the heap contains only one node, the algorithm stops here.
Steps to print codes from Huffman Tree:
Traverse the tree formed starting from the root. Maintain an auxiliary array. While moving
to the left child, write 0 to the array. While moving to the right child, write 1 to the array.
Print the array when a leaf node is encountered.

Steps to print code from HuffmanTree


The codes are as follows:
character code-word
f 0
c 100
d 101
a 1100
b 1101
e 111

Below is the implementation of above approach:

// C program for Huffman Coding


#include <stdio.h>
#include <stdlib.h>

// This constant can be avoided by explicitly


// calculating height of Huffman Tree
#define MAX_TREE_HT 100

// A Huffman tree node


struct MinHeapNode {

// One of the input characters


char data;

// Frequency of the character


unsigned freq;

// Left and right child of this node


struct MinHeapNode *left, *right;
};

// A Min Heap: Collection of


// min-heap (or Huffman tree) nodes
struct MinHeap {

// Current size of min heap


unsigned size;

// capacity of min heap


unsigned capacity;
// Array of minheap node pointers
struct MinHeapNode** array;
};

// A utility function allocate a new


// min heap node with given character
// and frequency of the character
struct MinHeapNode* newNode(char data, unsigned freq)
{
struct MinHeapNode* temp = (struct MinHeapNode*)malloc(
sizeof(struct MinHeapNode));

temp->left = temp->right = NULL;


temp->data = data;
temp->freq = freq;

return temp;
}

// A utility function to create


// a min heap of given capacity
struct MinHeap* createMinHeap(unsigned capacity)
{

struct MinHeap* minHeap


= (struct MinHeap*)malloc(sizeof(struct MinHeap));

// current size is 0
minHeap->size = 0;

minHeap->capacity = capacity;

minHeap->array = (struct MinHeapNode**)malloc(


minHeap->capacity * sizeof(struct MinHeapNode*));
return minHeap;
}

// A utility function to
// swap two min heap nodes
void swapMinHeapNode(struct MinHeapNode** a,
struct MinHeapNode** b)
{

struct MinHeapNode* t = *a;


*a = *b;
*b = t;
}

// The standard minHeapify function.


void minHeapify(struct MinHeap* minHeap, int idx)
{

int smallest = idx;


int left = 2 * idx + 1;
int right = 2 * idx + 2;

if (left < minHeap->size


&& minHeap->array[left]->freq
< minHeap->array[smallest]->freq)
smallest = left;

if (right < minHeap->size


&& minHeap->array[right]->freq
< minHeap->array[smallest]->freq)
smallest = right;

if (smallest != idx) {
swapMinHeapNode(&minHeap->array[smallest],
&minHeap->array[idx]);
minHeapify(minHeap, smallest);
}
}

// A utility function to check


// if size of heap is 1 or not
int isSizeOne(struct MinHeap* minHeap)
{

return (minHeap->size == 1);


}

// A standard function to extract


// minimum value node from heap
struct MinHeapNode* extractMin(struct MinHeap* minHeap)
{

struct MinHeapNode* temp = minHeap->array[0];


minHeap->array[0] = minHeap->array[minHeap->size - 1];

--minHeap->size;
minHeapify(minHeap, 0);

return temp;
}

// A utility function to insert


// a new node to Min Heap
void insertMinHeap(struct MinHeap* minHeap,
struct MinHeapNode* minHeapNode)
{

++minHeap->size;
int i = minHeap->size - 1;

while (i
&& minHeapNode->freq
< minHeap->array[(i - 1) / 2]->freq) {

minHeap->array[i] = minHeap->array[(i - 1) / 2];


i = (i - 1) / 2;
}

minHeap->array[i] = minHeapNode;
}

// A standard function to build min heap


void buildMinHeap(struct MinHeap* minHeap)
{

int n = minHeap->size - 1;
int i;

for (i = (n - 1) / 2; i >= 0; --i)


minHeapify(minHeap, i);
}

// A utility function to print an array of size n


void printArr(int arr[], int n)
{
int i;
for (i = 0; i < n; ++i)
printf("%d", arr[i]);

printf("\n");
}
// Utility function to check if this node is leaf
int isLeaf(struct MinHeapNode* root)
{

return !(root->left) && !(root->right);


}

// Creates a min heap of capacity


// equal to size and inserts all character of
// data[] in min heap. Initially size of
// min heap is equal to capacity
struct MinHeap* createAndBuildMinHeap(char data[],
int freq[], int size)
{

struct MinHeap* minHeap = createMinHeap(size);

for (int i = 0; i < size; ++i)


minHeap->array[i] = newNode(data[i], freq[i]);

minHeap->size = size;
buildMinHeap(minHeap);

return minHeap;
}

// The main function that builds Huffman tree


struct MinHeapNode* buildHuffmanTree(char data[],
int freq[], int size)

{
struct MinHeapNode *left, *right, *top;

// Step 1: Create a min heap of capacity


// equal to size. Initially, there are
// nodes equal to size.
struct MinHeap* minHeap
= createAndBuildMinHeap(data, freq, size);

// Iterate while size of heap doesn't become 1


while (!isSizeOne(minHeap)) {

// Step 2: Extract the two minimum


// freq items from min heap
left = extractMin(minHeap);
right = extractMin(minHeap);

// Step 3: Create a new internal


// node with frequency equal to the
// sum of the two nodes frequencies.
// Make the two extracted node as
// left and right children of this new node.
// Add this node to the min heap
// '$' is a special value for internal nodes, not
// used
top = newNode('$', left->freq + right->freq);

top->left = left;
top->right = right;

insertMinHeap(minHeap, top);
}

// Step 4: The remaining node is the


// root node and the tree is complete.
return extractMin(minHeap);
}

// Prints huffman codes from the root of Huffman Tree.


// It uses arr[] to store codes
void printCodes(struct MinHeapNode* root, int arr[],
int top)
{

// Assign 0 to left edge and recur


if (root->left) {

arr[top] = 0;
printCodes(root->left, arr, top + 1);
}

// Assign 1 to right edge and recur


if (root->right) {

arr[top] = 1;
printCodes(root->right, arr, top + 1);
}

// If this is a leaf node, then


// it contains one of the input
// characters, print the character
// and its code from arr[]
if (isLeaf(root)) {

printf("%c: ", root->data);


printArr(arr, top);
}
}

// The main function that builds a


// Huffman Tree and print codes by traversing
// the built Huffman Tree
void HuffmanCodes(char data[], int freq[], int size)

{
// Construct Huffman Tree
struct MinHeapNode* root
= buildHuffmanTree(data, freq, size);

// Print Huffman codes using


// the Huffman tree built above
int arr[MAX_TREE_HT], top = 0;

printCodes(root, arr, top);


}

// Driver code
int main()
{

char arr[] = { 'a', 'b', 'c', 'd', 'e', 'f' };


int freq[] = { 5, 9, 12, 13, 16, 45 };

int size = sizeof(arr) / sizeof(arr[0]);

HuffmanCodes(arr, freq, size);

return 0;
}

Output
f: 0
c: 100
d: 101
a: 1100
b: 1101
e: 111
Time complexity: O(nlogn) where n is the number of unique characters. If there are
n nodes, extractMin() is called 2*(n – 1) times. extractMin() takes O(logn) time as it calls
minHeapify(). So, the overall complexity is O(nlogn).
If the input array is sorted by frequency, there exists a linear time algorithm.
Space complexity :- O(N)
Applications of Huffman Coding:
1. They are used for transmitting fax and text.
2. They are used by conventional compression formats like PKZIP, GZIP, etc.
3. Multimedia codecs like JPEG, PNG, and MP3 use Huffman encoding(to be more
precise the prefix codes).

UNIT-V NP Complete and NP Hard

NP-Completeness:
A decision problem L is NP-Hard if

L' ≤p L for all L' ϵ NP.

Definition: L is NP-complete if

1. L ϵ NP and
2. L' ≤p L for some known NP-complete problem L'.
Given this formal definition, the complexity classes are:

P: is the set of decision problems that are solvable in polynomial time.

NP: is the set of decision problems that can be verified in polynomial time.

NP-Hard: L is NP-hard if for all L' ϵ NP, L' ≤p L. Thus if we can solve L in polynomial
time, we can solve all NP problems in polynomial time.

NP-Complete L is NP-complete if

1. L ϵ NP and
2. L is NP-hard
If any NP-complete problem is solvable in polynomial time, then every NP-Complete
problem is also solvable in polynomial time. Conversely, if we can prove that any NP-
Complete problem cannot be solved in polynomial time, every NP-Complete problem cannot
be solvable in polynomial time.

Reductions

Concept: - No polynomial-time solution is known for any NPC problem, but one NPC problem can be converted into another NPC problem in polynomial time. For this, we need the concept of reduction. If a solution to any one NPC problem were found in polynomial time, then via these reductions every other NPC problem could also be solved in polynomial time (though this is widely believed to be impossible).

Example: - Suppose there are two problems, A and B, and it is known that problem A cannot be solved in polynomial time. To prove that B also cannot be solved in polynomial time, convert (reduce) problem A into problem B in polynomial time: if B had a polynomial-time solution, so would A, which is a contradiction.

Example of NP-Complete problem

NP problem: - Suppose we are given a DECISION-BASED problem: for a given input, the answer is yes or no.

Criteria to fall into either NP-hard or NP-complete:

1. A candidate output/solution is already given, and you can verify it within polynomial time, but you cannot necessarily produce an output/solution in polynomial time.
2. The concept of reduction is needed: when you cannot produce an output for the problem directly, you convert one problem into another problem in polynomial time.

Note1:- If a problem satisfies both points, it falls into the NP-complete class.

Note2:- If a problem satisfies only the 2nd point, it falls into the NP-hard class.

So for a given decision-based NP problem, the answer is yes or no. If yes, its solution can be verified in polynomial time, and other problems can be converted to it via reduction. If both conditions hold, the decision-based NP problem is NP-complete.

Here we will emphasize NPC.


NP-complete problems are a subset of the larger class of NP (nondeterministic
polynomial time) problems. NP problems are a class of computational problems
that can be solved in polynomial time by a non-deterministic machine and can be
verified in polynomial time by a deterministic Machine. A problem L in NP is NP-
complete if all other problems in NP can be reduced to L in polynomial time. If
any NP-complete problem can be solved in polynomial time, then every problem
in NP can be solved in polynomial time. NP-complete problems are the hardest
problems in the NP set.
A decision problem L is NP-complete if it follows the below two properties:
1. L is in NP (Any solution to NP-complete problems can be checked quickly, but
no efficient solution is known).
2. Every problem in NP is reducible to L in polynomial time (Reduction is defined
below).
A problem is NP-Hard if it obeys Property 2 above and need not obey Property 1.
Therefore, a problem is NP-complete if it is both NP and NP-hard.
NP-Complete Complexity Classes

Decision vs Optimization Problems


NP-completeness applies to the realm of decision problems. It was set up this
way because it’s easier to compare the difficulty of decision problems than that of
optimization problems. In reality, though, being able to solve a decision problem
in polynomial time will often permit us to solve the corresponding optimization
problem in polynomial time (using a polynomial number of calls to the decision
problem). So, discussing the difficulty of decision problems is often really
equivalent to discussing the difficulty of optimization problems.
For example, consider the vertex cover problem (Given a graph, find out the
minimum sized vertex set that covers all edges). It is an optimization problem. The
corresponding decision problem is, given undirected graph G and k, is there a
vertex cover of size k?
What is Reduction?
Let L1 and L2 be two decision problems. Suppose algorithm A2 solves L2. That
is, if y is an input for L2 then algorithm A2 will answer Yes or No depending upon
whether y belongs to L2 or not.
The idea is to find a transformation from L1 to L2 so that algorithm A2 can be part
of algorithm A1 to solve L1.
Learning reduction, in general, is very important. For example, if we have
library functions to solve certain problems and if we can reduce a new problem to
one of the solved problems, we save a lot of time. Consider the example of a
problem where we have to find the minimum product path in a given directed
graph where the product of the path is the multiplication of weights of edges along
the path. If we have code for Dijkstra’s algorithm to find the shortest path, we can
take the log of all weights and use Dijkstra’s algorithm to find the minimum product
path rather than writing a fresh code for this new problem.
How to prove that a given problem is NP-complete?
From the definition of NP-complete, it appears impossible to prove that a
problem L is NP-Complete. By definition, it requires us to show that every
problem in NP is polynomial time reducible to L. Fortunately, there is an alternate
way to prove it. The idea is to take a known NP-Complete problem and reduce it
to L. If a polynomial-time reduction is possible, we can prove that L is NP-
Complete by transitivity of reduction (If an NP-Complete problem is reducible to L
in polynomial time, then all problems are reducible to L in polynomial time).
What was the first problem proved as NP-Complete?
There must be some first NP-Complete problem proved by the definition of
NP-Complete problems. SAT (Boolean satisfiability problem) is the first NP-
Complete problem proved by Cook (See CLRS book for proof).
It is always useful to know about NP-Completeness even for engineers. Suppose
you are asked to write an efficient algorithm to solve an extremely important
problem for your company. After a lot of thinking, you can only come up with an
exponential-time approach, which is impractical. If you don't know about NP-
Completeness, you can only say that I could not come up with an efficient
algorithm. If you know about NP-Completeness and prove that the problem is NP-
complete, you can proudly say that the polynomial-time solution is unlikely to exist.
If a polynomial-time solution were possible, it would settle one of the biggest open
problems in computer science, which scientists have been trying to resolve for years.
Polynomial Time Verification:
Before talking about the class of NP-complete problems, it is essential to introduce the
notion of a verification algorithm.

Many problems are hard to solve, but they have the property that it is easy to verify the solution if one is provided.

Hamiltonian cycle problem:-

Consider the Hamiltonian cycle problem. Given an undirected graph G, does G have a cycle that visits each vertex exactly once? There is no known polynomial time algorithm for this problem.

Note: - It means there is no known way to build a Hamiltonian cycle in a graph in polynomial time; however, if a specific sequence of vertices is given as a candidate cycle, you can verify that it is a Hamiltonian cycle within polynomial time.
Fig: Hamiltonian Cycle

Suppose a graph did have a Hamiltonian cycle. It would be easy for someone to convince us of this: they would simply say "the cycle is ⟨v3, v7, v1, ..., v13⟩".

We could then inspect the graph and check that this is indeed a legal cycle and that it
visits all of the vertices of the graph exactly once. Thus, even though we know of no efficient
way to solve the Hamiltonian cycle problem, there is a beneficial way to verify that a given
cycle is indeed a Hamiltonian cycle.

Note:- For polynomial-time verification of a Hamiltonian cycle in an undirected graph G, an exact/specific/definite path (sequence of vertices) of the claimed Hamiltonian cycle must be given; then the verification can be done in polynomial time.

Definition of Certificate: - The piece of information supplied to the verifier (here, the given path of vertices) is known as a certificate.

Relation of P and NP classes

1. P is contained in NP
2. Whether P = NP

1. Observe that P is contained in NP. In other words, if we can solve a problem in polynomial time, we can indeed verify the solution in polynomial time. More formally, we do not need to see a certificate (there is no need to specify the vertices of the specific path) to solve the problem; we can solve it in polynomial time anyway.
2. However, it is not known whether P = NP. Being able to verify a solution in polynomial time (the defining property of NP) does not imply being able to produce that solution in polynomial time, so whether every problem in NP is also in P remains an open question.

Reductions:

The class NP-complete (NPC) problems consist of a set of decision problems (a subset
of class NP) that no one knows how to solve efficiently. But if there were a polynomial
solution for even a single NP-complete problem, then every problem in NPC will be solvable
in polynomial time. For this, we need the concept of reductions.

Suppose there are two problems, A and B. You know that it is impossible to solve problem A in polynomial time. You want to prove that B cannot be solved in polynomial time. We want to show that (A ∉ P) => (B ∉ P).

Consider an example to illustrate reduction: The following problem is well-known to be


NPC:
3-color: Given a graph G, can each of its vertices be labeled with one of 3 different colors such that no two adjacent vertices have the same label (color)?

Coloring arises in various partitioning problems where there is a constraint that two objects cannot be assigned to the same partition. The term "coloring" comes from the original application, map drawing: two countries that share a common border should be colored with different colors.

It is well known that planar graphs can be colored (maps) with four colors. There
exists a polynomial time algorithm for this. But deciding whether this can be done with 3
colors is hard, and there is no polynomial time algorithm for it.

Fig: Example of 3-colorable and non-3-colorable graphs.

Polynomial Time Reduction:

We say that decision problem L1 is polynomial-time reducible to decision problem L2 (written L1 ≤p L2) if there is a polynomial-time computable function f such that for all x, x ∈ L1 if and only if f(x) ∈ L2.

NP completeness proof:
Figure 28.12.1: We will use this sequence of reductions for the NP Complete Proof

NP complete Problems:
NP is the set of problems whose solutions are hard to find but easy to verify, and which can be solved by a non-deterministic machine in polynomial time.
NP-Hard Problem:
A Problem X is NP-Hard if there is an NP-Complete problem Y, such that Y is reducible
to X in polynomial time. NP-Hard problems are as hard as NP-Complete problems. NP-
Hard Problem need not be in NP class.
A problem is called NP-Hard if every problem in NP can be polynomial-time reduced to it. In practice, NP-hardness is often proved by taking a particular known hard problem and reducing it to the problem at hand.
example :
1. Halting problem
2. Optimization version of the traveling salesman problem
3. Longest simple path problem
NP-Complete Problem:
A problem X is NP-Complete if X is in NP and every problem Y in NP is reducible to X in polynomial time. NP-Complete problems are the hardest problems in NP. A problem is NP-Complete if it is a part of both the NP and NP-Hard classes. A non-deterministic Turing machine can solve an NP-Complete problem in polynomial time.
A problem is NP-complete when it is both in NP and NP-hard; this means NP-complete problems can be verified in polynomial time.
Example:
1. Boolean satisfiability (SAT)
2. 3-coloring of a graph
Difference between NP-Hard and NP-Complete:
NP-Hard | NP-Complete
NP-Hard problems (say X) can be solved if and only if there is an NP-Complete problem (say Y) that can be reduced to X in polynomial time. | NP-Complete problems can be solved by a non-deterministic algorithm/Turing machine in polynomial time.
To be NP-Hard, a problem does not have to be in NP. | To be NP-Complete, a problem must be both in NP and NP-Hard.
Need not be a decision problem. | Is exclusively a decision problem.
Not all NP-Hard problems are NP-Complete. | All NP-Complete problems are NP-Hard.
May be an optimization problem. | Is a decision problem.
Example: Halting problem, vertex cover problem, etc. | Example: determining whether a graph has a Hamiltonian cycle, determining whether a Boolean formula is satisfiable, the circuit-satisfiability problem, etc.
