UNIT-IV & V Advanced Data Structures and Algorithms
Dynamic Programming:
Dynamic programming is a technique that breaks a problem into subproblems and saves their results so that we do not need to compute them again. The property that the overall solution can be built by optimizing the subproblems is known as the optimal substructure property. Dynamic programming is mainly used to solve optimization problems, i.e., problems that ask for the minimum or the maximum solution. Dynamic programming is guaranteed to find an optimal solution of a problem if one exists.
Consider the Fibonacci series as an example:
0, 1, 1, 2, 3, 5, 8, 13, 21, ...
The numbers in this series are not randomly chosen. Mathematically, each term is given by the formula:
F(n) = F(n-1) + F(n-2)
with the base values F(0) = 0 and F(1) = 1. To calculate the other numbers, we follow this relationship. For example, F(2) is the sum of F(0) and F(1), which is equal to 1.
The term F(20) is calculated by applying the same formula. The figure of the recursion tree shows how F(20) is calculated.
As the recursion tree shows, F(20) is calculated as the sum of F(19) and F(18). In the dynamic programming approach, we divide the problem into similar subproblems; here, F(20) is divided into the similar subproblems F(19) and F(18). By the definition of dynamic programming, a similar subproblem should not be computed more than once. Yet in the naive recursion, F(18) is calculated twice, and similarly F(17) is calculated twice. The technique of solving similar subproblems is useful, but we must be careful to store the results we have already computed; otherwise the recomputation leads to a waste of resources.
For example, if we recompute F(18) in the right subtree instead of reusing the stored value, resource usage grows tremendously and overall performance decreases.
The solution to the above problem is to save the computed results in an array. First, we calculate F(16) and F(17) and save their values in the array. F(18) is then obtained by summing the already-stored values of F(17) and F(16), and the computed value of F(18) is saved as well. The value of F(19) is calculated from the stored values of F(18) and F(17), and F(20) from the stored values of F(19) and F(18); each result is stored in the array as it is computed.
The following are the steps that dynamic programming follows:
1. Break the complex problem into simpler subproblems.
2. Find an optimal solution to each of these subproblems.
3. Store the result of each subproblem (memoization).
4. Reuse the stored results so that the same subproblem is never solved twice.
5. Combine the subproblem results to obtain the final answer.
The above five steps are the basic steps of dynamic programming. Dynamic programming is applicable to problems having properties such as:
Problems having overlapping subproblems and optimal substructure. Here, optimal substructure means that the solution of an optimization problem can be obtained by combining the optimal solutions of all its subproblems.
With dynamic programming, the space complexity increases because we store the intermediate results, but the time complexity decreases.
Top-down approach
The top-down approach uses the memoization technique, while the bottom-up approach uses the tabulation method. Here, memoization equals recursion plus caching: recursion means the function calls itself, while caching means storing the intermediate results.
Advantages
It is easy to write, since it follows directly from the recursive definition, and it computes only the subproblems that are actually needed.
Disadvantages
It uses recursion, which occupies extra memory in the call stack. When the recursion is too deep, a stack overflow can occur.
int fib(int n)
{
    if (n < 0)
        error;
    if (n == 0)
        return 0;
    if (n == 1)
        return 1;
    sum = fib(n - 1) + fib(n - 2);
    return sum;
}
In the above code, we use the recursive approach to compute the Fibonacci numbers. As the value of n increases, the number of function calls and computations also increases; the time complexity grows exponentially, becoming O(2^n).
One solution to this problem is the dynamic programming approach. Rather than generating the recursive tree again and again, we can reuse the previously calculated values. If we use the dynamic programming approach, the time complexity becomes O(n).
When we apply the dynamic programming approach to the Fibonacci series, the code looks like this:
static int count = 0;
int fib(int n)
{
    if (memo[n] != NULL)
        return memo[n];
    count++;
    if (n < 0)
        error;
    if (n == 0)
        return 0;
    if (n == 1)
        return 1;
    sum = fib(n - 1) + fib(n - 2);
    memo[n] = sum;
    return sum;
}
In the above code, we use the memoization technique, in which we store the results in an array so that the values can be reused. This is also known as the top-down approach, in which we start from the top and break the problem into subproblems.
Bottom-Up approach
The bottom-up approach is the other technique for implementing dynamic programming. It uses the tabulation technique, solving the same kind of problems but without recursion. Removing the recursion eliminates the stack overflow issue and the overhead of recursive function calls. In the tabulation technique, we solve the problems and store the results in a table.
Dynamic programming can therefore be implemented in two ways:
o Top-Down
o Bottom-Up
The bottom-up approach avoids recursion, thus saving memory space. A bottom-up algorithm starts from the beginning, whereas a recursive algorithm starts from the end and works backward. In the bottom-up approach, we start from the base cases and work toward the final answer. As we know, the base cases of the Fibonacci series are 0 and 1, so we start from 0 and 1.
Key points
o We solve all the smaller subproblems that will be needed for the larger subproblems, then move on to the larger problems using those results.
o We use a for loop to iterate over the subproblems.
o The bottom-up approach is also known as the tabulation or table filling method.
Suppose we have an array with the values 0 and 1 at positions a[0] and a[1], respectively. Since the bottom-up approach starts from the lower values, a[0] and a[1] are added to find a[2] = 1. The value of a[3] is calculated by adding a[1] and a[2], giving 2. The value of a[4] is calculated by adding a[2] and a[3], giving 3. The value of a[5] is calculated by adding a[4] and a[3], giving 5.
The code for implementing the Fibonacci series using the bottom-up approach is given below:
int fib(int n)
{
    int A[n + 1];
    A[0] = 0, A[1] = 1;
    for (i = 2; i <= n; i++)
    {
        A[i] = A[i - 1] + A[i - 2];
    }
    return A[n];
}
In the above code, the base cases are 0 and 1, and we then use a for loop to find the other values of the Fibonacci series.
Initially, the first two values are 0 and 1. When i=2, the values 0 and 1 are added to give 1. When i=3, the values 1 and 1 are added to give 2. When i=4, the values 2 and 1 are added to give 3. When i=5, the values 3 and 2 are added to give 5.
In the above case, we are starting from the bottom and reaching to the top.
Algorithm Matrix-Chain-Order(p):
1. n ← length[p] - 1
2. for i ← 1 to n
3.     do m[i, i] ← 0
4. for l ← 2 to n               // l is the chain length
5.     do for i ← 1 to n - l + 1
6.         do j ← i + l - 1
7.             m[i, j] ← ∞
8.             for k ← i to j - 1
9.                 do q ← m[i, k] + m[k + 1, j] + p(i-1) p(k) p(j)
10.                if q < m[i, j]
11.                    then m[i, j] ← q
12.                         s[i, j] ← k
13. return m and s
Analysis: There are three nested loops, and each loop executes at most n times, so the running time is O(n^3).
Question: P = {7, 1, 5, 4, 2}
From P = {7, 1, 5, 4, 2} (given), where p holds the matrix dimensions:
p0 = 7, p1 = 1, p2 = 5, p3 = 4, p4 = 2.
The length of array P is the number of elements in P,
∴ length(p) = 5
Follow the steps of the algorithm in sequence.
According to Step 1 of Algorithm Matrix-Chain-Order:
Step 1:
n ← length[p] - 1
where n is the number of matrices, and length[p] = 5,
∴ n = 5 - 1 = 4
n = 4
Now we construct two tables m and s.
Table m has dimension [1...n, 1...n]
Table s has dimension [1...n-1, 2...n]
1. for i ← 1 to n
2. this means: for i ← 1 to 4 (because n =4)
3. for i=1
4. m [i, i]=0
5. m [1, 1]=0
6. Similarly for i = 2, 3, 4
7. m [2, 2] = m [3,3] = m [4,4] = 0
8. i.e. fill all the diagonal entries "0" in the table m
9. Now,
10. l ← 2 to n
11. l ← 2 to 4 (because n =4 )
Case 1:
1. When l = 2
for (i ← 1 to n - l + 1)
i ← 1 to 4 - 2 + 1
i ← 1 to 3
When i = 1
do j ← i + l - 1
j←1+2-1
j←2
i.e. j = 2
Now, m [i, j] ← ∞
i.e. m [1,2] ← ∞
Put ∞ in m [1, 2] table
for k ← i to j-1
k ← 1 to 2 - 1
k ← 1 to 1
k=1
Now q ← m [i, k] + m [k + 1, j] + pi-1 pk pj
for l = 2
i=1
j =2
k=1
q ← m [1,1] + m [2,2] + p0 x p1 x p2
and m [1,1] = 0, m [2,2] = 0
∴ q ← 0 + 0 + 7 x 1 x 5
q ← 35
We have m [i, j] = m [1, 2] = ∞
Comparing q with m [1, 2]
q < m [i, j]
i.e. 35 < m [1, 2]
35 < ∞
True
then, m [1, 2 ] ← 35 (∴ m [i,j] ← q)
s [1, 2] ← k
and the value of k = 1
s [1,2 ] ← 1
Insert "1" at dimension s [1, 2] in table s. And 35 at m [1, 2]
2. l remains 2
l = 2
i ← 1 to n - l + 1
i ← 1 to 4 - 2 + 1
i ← 1 to 3
for i = 1 done before
Now value of i becomes 2
i=2
j← i+l-1
j← 2+2-1
j← 3
j=3
m [i , j] ← ∞
i.e. m [2,3] ← ∞
Initially insert ∞ at m [2, 3]
Now, for k ← i to j - 1
k ← 2 to 3 - 1
k ← 2 to 2
i.e. k =2
Now, q ← m [i, k] + m [k + 1, j] + pi-1 pk pj
For l =2
i=2
j=3
k=2
q ← m [2, 2] + m [3, 3] + p1x p2 x p3
q←0+0+1x5x4
q ← 20
Compare q with m [i ,j ]
If q < m [i,j]
i.e. 20 < m [2, 3]
20 < ∞
True
Then m [i,j ] ← q
m [2, 3 ] ← 20
and s [2, 3] ← k
and k = 2
s [2,3] ← 2
3. Now i becomes 3
i=3
l=2
j←i+l-1
j←3+2-1
j←4
j=4
Now, m [i, j ] ← ∞
m [3,4 ] ← ∞
Insert ∞ at m [3, 4]
for k ← i to j - 1
k ← 3 to 4 - 1
k ← 3 to 3
i.e. k = 3
Now, q ← m [i, k] + m [k + 1, j] + pi-1 pk pj
i=3
l=2
j=4
k=3
q ← m [3, 3] + m [4,4] + p2 x p3 x p4
q←0+0+5x2x4
q ← 40
Compare q with m [i, j]
If q < m [i, j]
40 < m [3, 4]
40 < ∞
True
Then, m [i,j] ← q
m [3,4] ← 40
and s [3,4] ← k
s [3,4] ← 3
Case 2: l becomes 3
l = 3
for i = 1 to n - l + 1
i = 1 to 4 - 3 + 1
i = 1 to 2
When i = 1
j←i+l-1
j←1+3-1
j←3
j=3
Now, m [i,j] ← ∞
m [1, 3] ← ∞
for k ← i to j - 1
k ← 1 to 3 - 1
k ← 1 to 2
Now we compute q for both k = 1 and k = 2; the minimum of the two is placed in m [i,j], and the corresponding k in s [i,j]:
For k = 1: q ← m [1,1] + m [2,3] + p0 x p1 x p3 = 0 + 20 + 7 x 1 x 4 = 48
For k = 2: q ← m [1,2] + m [3,3] + p0 x p2 x p3 = 35 + 0 + 7 x 5 x 4 = 175
The minimum is 48, so m [1,3] ← 48 and s [1,3] ← 1.
Next, i becomes 2, so j ← i + l - 1 = 4. Here we also find the minimum value of m [i,j], for the two values k = 2 and k = 3:
For k = 2: q ← m [2,2] + m [3,4] + p1 x p2 x p4 = 0 + 40 + 1 x 5 x 2 = 50
For k = 3: q ← m [2,3] + m [4,4] + p1 x p3 x p4 = 20 + 0 + 1 x 4 x 2 = 28
Since 28 < 50 and 28 < ∞,
m [i,j] ← q, i.e. m [2,4] ← 28
and s [2,4] ← 3
i.e. in table s at s [2,4] insert 3, and at m [2,4] insert 28.
Case 3: l becomes 4
l = 4
For i ← 1 to n-l + 1
i ← 1 to 4 - 4 + 1
i←1
i=1
do j ← i + l - 1
j←1+4-1
j←4
j=4
Now m [i,j] ← ∞
m [1,4] ← ∞
for k ← i to j -1
k ← 1 to 4 - 1
k ← 1 to 3
When k = 1
q ← m [i, k] + m [k + 1, j] + pi-1 pk pj
q ← m [1,1] + m [2,4] + p0 x p1 x p4
q ← 0 + 28 + 7 x 1 x 2
q ← 42
Compare q and m [i, j]
m [i,j] was ∞
i.e. m [1,4]
if q < m [1,4]
42< ∞
True
Then m [i,j] ← q
m [1,4] ← 42
and s [1,4] ← 1 (since k = 1)
When k = 2
L = 4, i=1, j = 4
q ← m [i, k] + m [k + 1, j] + pi-1 pk pj
q ← m [1, 2] + m [3,4] + p0 xp2 xp4
q ← 35 + 40 + 7 x 5 x 2
q ← 145
Compare q and m [i,j]
Now m [i, j]
i.e. m [1,4] contains 42.
So if q < m [1, 4]
But 145 is not less than m [1, 4], i.e. 145 is not less than 42.
So no change occurs.
When k = 3
l=4
i=1
j=4
q ← m [i, k] + m [k + 1, j] + p(i-1) x p(k) x p(j)
q ← m [1, 3] + m [4,4] + p0 x p3 x p4
q ← 48 + 0 + 7 x 4 x 2
q ← 104
Again q is not less than m [i, j], i.e. 104 is not less than m [1, 4]: 104 is not less than 42.
So no change occurs. The value of m [1, 4] remains 42, and the value of s [1, 4] = 1.
We have to work out all the combinations, but only the combination with the minimum cost is taken into consideration.
Second example: consider five matrices with dimensions M1 = 4 x 10, M2 = 10 x 3, M3 = 3 x 12, M4 = 12 x 20 and M5 = 20 x 7 (the first dimension, 4, is inferred from the cost M [1, 3] = 264 computed below). The costs of the length-2 chains are:
1. m (1, 2) = M1 x M2
= 4 x 10 x 10 x 3
= 4 x 10 x 3 = 120
2. m (2, 3) = M2 x M3
= 10 x 3 x 3 x 12
= 10 x 3 x 12 = 360
3. m (3, 4) = M3 x M4
= 3 x 12 x 12 x 20
= 3 x 12 x 20 = 720
4. m (4, 5) = M4 x M5
= 12 x 20 x 20 x 7
= 12 x 20 x 7 = 1680
o We initialize the diagonal elements, where i = j, with '0'.
o After that, the second diagonal is worked out, and we get all the values corresponding to it.
Now the third diagonal will be solved in the same way.
M [1, 3] = M1 M2 M3
1. There are two cases by which we can solve this multiplication: (M1 x M2) x M3 and M1 x (M2 x M3).
2. After solving both cases, we choose the case with the minimum cost.
M [1, 3] = 264
Comparing both outputs, 264 is the minimum, so we insert 264 in the table, and the combination (M1 x M2) x M3 is chosen for building the output.
M [2, 4] = M2 M3 M4
1. There are two cases by which we can solve this multiplication: (M2 x M3) x M4 and M2 x (M3 x M4).
2. After solving both cases, we choose the case with the minimum cost.
M [2, 4] = 1320
Comparing both outputs, 1320 is the minimum, so we insert 1320 in the table, and the combination M2 x (M3 x M4) is chosen for building the output.
M [3, 5] = M3 M4 M5
1. There are two cases by which we can solve this multiplication: (M3 x M4) x M5 and M3 x (M4 x M5).
2. After solving both cases, we choose the case with the minimum cost.
M [3, 5] = 1140
Comparing both outputs, 1140 is the minimum, so we insert 1140 in the table, and the combination (M3 x M4) x M5 is chosen for building the output.
M [1, 4] = M1 M2 M3 M4
There are three cases by which we can solve this multiplication:
1. (M1 x M2 x M3) x M4
2. M1 x (M2 x M3 x M4)
3. (M1 x M2) x (M3 x M4)
After solving these cases, we choose the case with the minimum cost.
M [1, 4] = 1080
Comparing the outputs of the different cases, 1080 is the minimum, so we insert 1080 in the table, and the combination (M1 x M2) x (M3 x M4) is chosen for building the output.
M [2, 5] = M2 M3 M4 M5
There are three cases by which we can solve this multiplication:
1. (M2 x M3 x M4) x M5
2. M2 x (M3 x M4 x M5)
3. (M2 x M3) x (M4 x M5)
After solving these cases, we choose the case with the minimum cost.
M [2, 5] = 1350
Comparing the outputs of the different cases, 1350 is the minimum, so we insert 1350 in the table, and the combination M2 x (M3 x M4 x M5) is chosen for building the output.
M [1, 5] = M1 M2 M3 M4 M5
There are four cases by which we can solve this multiplication:
1. (M1 x M2 x M3 x M4) x M5
2. M1 x (M2 x M3 x M4 x M5)
3. (M1 x M2 x M3) x (M4 x M5)
4. (M1 x M2) x (M3 x M4 x M5)
After solving these cases, we choose the case with the minimum cost.
M [1, 5] = 1344
Comparing the outputs of the different cases, 1344 is the minimum, so we insert 1344 in the table, and the combination (M1 x M2) x (M3 x M4 x M5) is chosen for building the output.
The final output is the completed cost table, with minimum total cost M [1, 5] = 1344.
Step 3: Computing Optimal Costs: let us assume that matrix Ai has dimension p(i-1) x p(i) for i = 1, 2, ..., n. The input is a sequence (p0, p1, ..., pn), where length [p] = n + 1. The procedure uses an auxiliary table m [1...n, 1...n] for storing the m [i, j] costs and an auxiliary table s [1...n, 1...n] that records which index k achieved the optimal cost in computing m [i, j].
The algorithm first computes m [i, i] ← 0 for i = 1, 2, ..., n, the minimum costs for chains of length 1.
Approach: Generally, tabulation (dynamic programming) is an iterative approach, whereas memoization is a recursive approach.
Optimal Substructure Property: If the optimal solution of the given problem can be
obtained by using optimal solutions of its subproblems then the problem is said to have
Optimal Substructure Property.
Steps to solve a Dynamic programming problem:
1. Identify if it is a Dynamic programming problem.
2. Decide a state expression with the Least parameters.
3. Formulate state and transition relationships.
4. Do tabulation (or memoization).
1) How to classify a problem as a Dynamic Programming algorithm Problem?
Typically, problems that require maximizing or minimizing certain quantities, counting problems that ask to count arrangements under certain conditions, and certain probability problems can be solved using Dynamic Programming.
All dynamic programming problems satisfy the overlapping subproblems property, and most of the classic dynamic programming problems also satisfy the optimal substructure property. Once we observe these properties in a given problem, we can be sure that it can be solved using Dynamic Programming.
2) Deciding the state:
Problems in dynamic programming are mostly concerned with the state and its transition. Defining the state is the most fundamental step and must be carried out with extreme care, because the state transition depends on the state definition you select.
State:
A state is a collection of parameters that can be used to uniquely describe a given position or standing in the problem. This set of parameters should be as compact as feasible, to minimise the state space.
3) Formulating a relation among the states:
This is the hardest part of a Dynamic Programming problem, and it calls for a lot of intuition, observation, and practice.
Example:
Given 3 numbers {1, 3, 5}, the task is to tell the total number of ways we can form a
number N using the sum of the given three numbers. (allowing repetitions and different
arrangements).
The total number of ways to form 6 is: 8
1+1+1+1+1+1
1+1+1+3
1+1+3+1
1+3+1+1
3+1+1+1
3+3
1+5
5+1
Following are the steps to solve the problem:
We choose a state for the given problem.
N will be used as the determining factor for the state because it can be used to identify any subproblem. The DP state will resemble state(N), where state(N) is the total number of arrangements required to form N using the elements 1, 3, and 5. Next, identify the transition relationship between any two states.
We must now calculate state(N).
3.1) How to Compute the state?
We can only use 1, 3, or 5 to form a given number N. Let us assume that we know the results for N = 1, 2, 3, 4, 5, 6.
Let us say we know the result for:
state (n = 1), state (n = 2), state (n = 3) ……… state (n = 6)
Now, we wish to know the result of the state (n = 7). See, we can only add 1, 3, and 5.
Now we can get a sum total of 7 in the following 3 ways:
1) Adding 1 to all possible combinations of state (n = 6)
Eg: [ (1+1+1+1+1+1) + 1]
[ (1+1+1+3) + 1]
[ (1+1+3+1) + 1]
[ (1+3+1+1) + 1]
[ (3+1+1+1) + 1]
[ (3+3) + 1]
[ (1+5) + 1]
[ (5+1) + 1]
2) Adding 3 to all possible combinations of state (n = 4);
[(1+1+1+1) + 3]
[(1+3) + 3]
[(3+1) + 3]
3) Adding 5 to all possible combinations of state(n = 2)
[ (1+1) + 5]
(Note how it is sufficient to add only on the right side; all the add-from-the-left cases are already covered, either in the same state or in another. For example, [1 + (1+1+1+3)] is not needed in state (n = 6) because it is covered by [(1+1+1+1) + 3] in state (n = 4).)
Now, think carefully and satisfy yourself that the above three cases cover all possible ways to form a sum total of 7.
Therefore, we can say that result for
state(7) = state (6) + state (4) + state (2)
OR
state(7) = state (7-1) + state (7-3) + state (7-5)
In general,
state(n) = state(n-1) + state(n-3) + state(n-5)
Below is the implementation for the above approach:
// Returns the number of arrangements to
// form 'n', memoizing each state in dp[]
int solve(int n, int* dp)
{
    // base case
    if (n < 0)
        return 0;
    if (n == 0)
        return 1;
    // return the stored result if this state was already computed
    if (dp[n] != -1)
        return dp[n];
    return dp[n] = solve(n - 1, dp) + solve(n - 3, dp) + solve(n - 5, dp);
}
Time Complexity: O(n), as we make about 3n function calls and there are no repetitive calculations, since we return previously calculated results.
Auxiliary Space: O(n); the extra space is used by the dp array and the recursion call stack.
How to solve Dynamic Programming problems through Example?
Problem: Let’s find the Fibonacci sequence up to the nth term. A Fibonacci series is the
sequence of numbers in which each number is the sum of the two preceding ones. For
example, 0, 1, 1, 2, 3, and so on. Here, each number is the sum of the two preceding
numbers.
Naive Approach: The basic way to find the nth Fibonacci number is to use recursion.
Below is the implementation for the above approach:
#include <iostream>
using namespace std;

int fib(int n)
{
    // Base case
    if (n <= 1)
        return n;
    // Recursive calls on the two preceding terms
    int x = fib(n - 1);
    int y = fib(n - 2);
    return x + y;
}

// Drivers code
int main()
{
    int n = 5;
    // Function Call
    cout << fib(n);
    return 0;
}
Output
5
Complexity Analysis:
Time Complexity: O(2^n)
Here, for every n, we make recursive calls to fib(n - 1) and fib(n - 2). For fib(n - 1), we again make recursive calls to fib(n - 2) and fib(n - 3). Similarly, for fib(n - 2), recursive calls are made on fib(n - 3) and fib(n - 4), until we reach the base case.
During each recursive call, we perform constant work k (adding previous outputs to obtain the current output). At level i of the recursion tree we perform about 2^i x k work (for i = 0, 1, 2, ...), and since the tree has height about n, we perform about 2^(n-1) x k work at the final level. The total work is therefore k x (1 + 2 + 4 + ... + 2^(n-1)) = k x (2^n - 1), which is O(2^n).
If we draw the recursion tree of the Fibonacci recursion, we find that its maximum height is n, and hence the space complexity of the Fibonacci recursion is O(n).
Efficient approach: Since exponential complexity is very poor, we need to optimize the method with an efficient one (Memoization).
Let’s look at the example below for finding the 5th Fibonacci number.
Observations:
The entire program repeats recursive calls. As in the above figure, for calculating
fib(4), we need the value of fib(3) (first recursive call over fib(3)), and for calculating
fib(5), we again need the value of fib(3)(second similar recursive call over fib(3)).
Both of these recursive calls are shown above in the outlining circle.
Similarly, there are many others for which we are repeating the recursive calls.
Recursion generally involves repeated recursive calls, which increases the program’s
time complexity.
By storing the output of previously encountered values (preferably in arrays, as these
can be traversed and extracted most efficiently), we can overcome this problem. The
next time we make a recursive call over these values, we will use their already stored
outputs instead of calculating them all over again.
In this way, we can improve the performance of our code. Memoization is the process
of storing each recursive call’s output for later use, preventing the code from
calculating it again.
Way to memoize: To achieve this in our example we will simply take an answer array
initialized to -1. As we make a recursive call, we will first check if the value stored in the
answer array corresponding to that position is -1. The value -1 indicates that we haven’t
calculated it yet and have to recursively compute it. The output must be stored in the
answer array so that, next time, if the same value is encountered, it can be directly used
from the answer array.
Now in this process of memoization, considering the above Fibonacci numbers example, it
can be observed that the total number of unique calls will be at most (n + 1) only.
Below is the implementation for the above approach:
#include <iostream>
using namespace std;

// Helper Function
int fibo_helper(int n, int* ans)
{
    // Base case
    if (n <= 1) {
        return n;
    }
    // Use the stored output if it was already calculated
    if (ans[n] != -1) {
        return ans[n];
    }
    // Calculate output
    int x = fibo_helper(n - 1, ans);
    int y = fibo_helper(n - 2, ans);
    // Store the output for later use
    ans[n] = x + y;
    return ans[n];
}

int fibo(int n)
{
    int* ans = new int[n + 1];
    // Initializing with -1
    for (int i = 0; i <= n; i++) {
        ans[i] = -1;
    }
    int result = fibo_helper(n, ans);
    delete[] ans;
    return result;
}
// Drivers code
int main()
{
int n = 5;
// Function Call
cout << fibo(n);
return 0;
}
Output
5
Complexity analysis:
Time complexity: O(n)
Auxiliary Space: O(n)
Optimized approach: Following a bottom-up approach to reach the desired index. This
approach of converting recursion into iteration is known as Dynamic programming(DP).
Observations:
Finally, we iteratively fill each index of the answer array, calculating its value using previously saved outputs.
Recursive calls terminate at the base cases, which means we already know the answers that should be stored at the base-case indexes. In the case of Fibonacci numbers, these indexes are 0 and 1, since fib(0) = 0 and fib(1) = 1. So we can directly assign these two values into our answer array and then use them to calculate fib(2), which is fib(1) + fib(0), and so on for each subsequent index.
This can easily be done iteratively by running a loop from i = 2 to n. Finally, we get our answer at the nth index of the array, because we already know that the ith index contains the answer to the ith value.
Simply, we first find the dependence of the current value on previous values and then use them to calculate the new value. We look for values that do not depend on other values, i.e., the independent base-case values, since these are the smallest subproblems, which we already know.
Below is the implementation for the above approach:
// C++ code for the above approach:
#include <iostream>
using namespace std;

int fibo(int n)
{
    if (n <= 1)
        return n;
    int* ans = new int[n + 1];
    // Base cases
    ans[0] = 0;
    ans[1] = 1;
    // Fill each index using the two previously saved outputs
    for (int i = 2; i <= n; i++) {
        ans[i] = ans[i - 1] + ans[i - 2];
    }
    int result = ans[n];
    delete[] ans;
    return result;
}

// Drivers code
int main()
{
    int n = 5;
    // Function Call
    cout << fibo(n);
    return 0;
}
Output
5
Complexity analysis:
Time complexity: O(n)
Auxiliary Space: O(n)
Optimization of above method
In the above code, we can see that the current state of any Fibonacci number depends only on the previous two numbers. Using this observation, we can conclude that we do not need to store the whole table of size n; instead, we can store only the previous two values. This way we can optimize the space complexity of the above code from O(n) to O(1).
#include <iostream>
using namespace std;

int fibo(int n)
{
    if (n <= 1)
        return n;
    // Keep only the previous two values instead of the whole table
    int prev2 = 0, prev1 = 1;
    for (int i = 2; i <= n; i++) {
        int curr = prev1 + prev2;
        prev2 = prev1;
        prev1 = curr;
    }
    return prev1;
}

// Drivers code
int main()
{
    int n = 5;
    // Function Call
    cout << fibo(n);
    return 0;
}
Output
5
Recursion: Dynamic programming is an algorithmic technique which is usually based on a recurrent formula that uses some previously calculated states, whereas a greedy method follows the problem-solving heuristic of making the locally optimal choice at each stage.
Example: the 0/1 knapsack problem is a typical dynamic programming example, whereas the fractional knapsack problem is solved greedily.
Longest Common Subsequence (LCS)
A subsequence of a given sequence is just the given sequence with some elements left out. Given two sequences X and Y, we say that a sequence Z is a common subsequence of X and Y if Z is a subsequence of both X and Y.
In the longest common subsequence problem, we are given two sequences X = (x1 x2 ... xm) and Y = (y1 y2 ... yn) and wish to find a maximum-length common subsequence of X and Y. The LCS problem can be solved using dynamic programming.
In a brute-force approach, we find all the subsequences of X and check each subsequence to see if it is also a subsequence of Y; this approach requires exponential time, making it impractical for long sequences.
Given a sequence X = (x1 x2 ... xm), we define the ith prefix of X, for i = 0, 1, ..., m, as Xi = (x1 x2 ... xi). For example, if X = (A, B, C, B, C, A, B, C) then X4 = (A, B, C, B).
Optimal Substructure of an LCS: Let X = (x1 x2 ... xm) and Y = (y1 y2 ... yn) be sequences and let Z = (z1 z2 ... zk) be any LCS of X and Y.
Step 2: Recursive Solution: LCS has the overlapping subproblems property because, to find an LCS of X and Y, we may need to find the LCS of Xm-1 and Yn-1. If xm ≠ yn, then we must solve two subproblems: finding an LCS of Xm-1 and Y, and finding an LCS of X and Yn-1; whichever of these LCSs is longer is an LCS of X and Y. Each of these subproblems in turn contains the subproblem of finding the LCS of Xm-1 and Yn-1.
Let c [i, j] be the length of an LCS of the sequences Xi and Yj. If either i = 0 or j = 0, one of the sequences has length 0, so the LCS has length 0. The optimal substructure of the LCS problem gives the recurrence:
c [i, j] = 0, if i = 0 or j = 0
c [i, j] = c [i-1, j-1] + 1, if i, j > 0 and xi = yj
c [i, j] = max(c [i-1, j], c [i, j-1]), if i, j > 0 and xi ≠ yj
Step 3: Computing the length of an LCS: take two sequences X = (x1 x2 ... xm) and Y = (y1 y2 ... yn) as inputs. The procedure stores the c [i, j] values in a table c [0...m, 0...n]. A table b [1...m, 1...n] is also maintained to help us construct an optimal solution. c [m, n] contains the length of an LCS of X and Y.
The longest common subsequence problem is finding the longest sequence which exists in both the given strings.
But before we understand the problem, let us understand what the term subsequence means:
Let us consider a sequence S = <s1, s2, s3, s4, ..., sn>. A sequence Z = <z1, z2, z3, ..., zm> over S is called a subsequence of S if and only if it can be derived from S by deleting some elements. In simple words, a subsequence consists of elements that appear in the same order as in the sequence, but not necessarily consecutively.
Common Subsequence
Suppose, X and Y are two sequences over a finite set of elements. We can say that Z is a
common subsequence of X and Y, if Z is a subsequence of both X and Y.
If a set of sequences are given, the longest common subsequence problem is to find a
common subsequence of all the sequences that is of maximal length.
Naive Method
Let X be a sequence of length m and Y a sequence of length n. Check for every subsequence of X whether it is a subsequence of Y, and return the longest common subsequence found. Since X has 2^m subsequences, this takes exponential time.
Step 1 − Construct an empty table of size n × m, where n = size of sequence X and m = size of sequence Y. The rows in the table represent the elements of sequence X and the columns represent the elements of sequence Y.
Step 2 − The zeroth row and column are filled with zeroes, and the remaining values are filled in based on different cases, by maintaining a counter value.
Step 3 − Once the table is filled, backtrack from the last value in the table. Backtracking here is done by tracing the path along which the counter was first incremented.
Step 4 − The longest common subsequence is obtained by noting the elements along the traced path.
Pseudocode
In this procedure, table C[m, n] is computed in row-major order, and another table B[m, n] is computed to construct the optimal solution.
Analysis
To populate the table, the outer for loop iterates m times and the inner for loop iterates n times. Hence, the complexity of the algorithm is O(m × n), where m and n are the lengths of the two strings.
Example
In this example, we have two strings, X = BDCB and Y = BACDB, and we want to find their longest common subsequence.
In the table below, the zeroth row and column are filled with zeroes. The remaining values are filled by incrementing and choosing the maximum values according to the algorithm.
Once the values are filled, the path is traced back from the last value in the table, at T[4, 5]. From the traced path, the longest common subsequence is found by choosing the values where the counter was first incremented.
In this example, the final count is 3, so the counter is incremented at 3 places, i.e., B, C, B. Therefore, the longest common subsequence of sequences X and Y is BCB.
Implementation
Following is the final implementation to find the Longest Common Subsequence using
Dynamic Programming Approach −
#include <stdio.h>
#include <string.h>
int max(int a, int b);
int lcs(char* X, char* Y, int m, int n){
int L[m + 1][n + 1];
int i, j, index;
for (i = 0; i <= m; i++) {
for (j = 0; j <= n; j++) {
if (i == 0 || j == 0)
L[i][j] = 0;
else if (X[i - 1] == Y[j - 1]) {
L[i][j] = L[i - 1][j - 1] + 1;
} else
L[i][j] = max(L[i - 1][j], L[i][j - 1]);
}
}
index = L[m][n];
char LCS[index + 1];
LCS[index] = '\0';
i = m, j = n;
while (i > 0 && j > 0) {
if (X[i - 1] == Y[j - 1]) {
LCS[index - 1] = X[i - 1];
i--;
j--;
index--;
} else if (L[i - 1][j] > L[i][j - 1])
i--;
else
j--;
}
printf("LCS: %s\n", LCS);
return L[m][n];
}
int max(int a, int b){
return (a > b) ? a : b;
}
int main(){
char X[] = "ABSDHS";
char Y[] = "ABDHSP";
int m = strlen(X);
int n = strlen(Y);
printf("Length of LCS is %d\n", lcs(X, Y, m, n));
return 0;
}
Output
LCS: ABDHS
Length of LCS is 5
Applications
The longest common subsequence problem is a classic computer science problem, the basis
of data comparison programs such as the diff-utility, and has applications in bioinformatics.
It is also widely used by revision control systems, such as SVN and Git, for reconciling
multiple changes made to a revision-controlled collection of files.
Greedy Algorithm:
The greedy method is one of the strategies, like divide and conquer, used to solve problems. This method is used for solving optimization problems, i.e., problems that demand either a maximum or a minimum result. Let's understand it through some terms.
This technique is basically used to determine a feasible solution, which may or may not be optimal. A feasible solution is a solution that satisfies the given criteria; the optimal solution is the best, most favorable solution among them. If more than one solution satisfies the given criteria, all of them are considered feasible, whereas the optimal solution is the best solution among all of them.
o To construct the solution in an optimal way, the algorithm maintains two sets: one
set contains all the chosen items, and the other set contains the rejected items.
o A greedy algorithm makes good local choices in the hope that the resulting solution
will be feasible, and ideally optimal.
o Candidate set: the set of items from which a solution is created.
o Selection function: chooses the candidate that can be added to the solution.
o Feasibility function: determines whether a candidate can be used to contribute to
the solution or not.
o Objective function: assigns a value to the solution or to a partial solution.
o Solution function: indicates whether a complete solution has been reached or not.
In outline, the greedy algorithm works as follows. Initially, the solution is empty.
We pass the array of candidates and the number of elements to the algorithm. Inside the
loop, we select the elements one by one and check whether adding the current element keeps
the solution feasible. If the solution remains feasible, we perform the union, i.e., add the
element to the solution.
P : A → B
The problem is that we have to travel from A to B. There are various ways to make this
journey: we can go from A to B on foot, or by car, bike, train, aeroplane, etc. There is a
constraint on the journey: it must be completed within 12 hrs. Only the train and the
aeroplane can cover the distance within 12 hrs, so although there are many solutions to this
problem, only two of them satisfy the constraint and are therefore feasible.
Now suppose we also have to cover the journey at minimum cost. This makes the
problem a minimization problem. So far we have two feasible solutions, one by train and one
by air. Since travelling by train leads to the minimum cost, it is the optimal solution. An
optimal solution is also a feasible solution, but it is the one providing the best result, here
the minimum cost. For this problem there is only one optimal solution.
A problem that requires either a minimum or a maximum result is known as an
optimization problem, and the greedy method is one of the strategies used for solving such
problems.
A greedy algorithm makes decisions based only on the information available at the
current phase, without considering the broader problem. So there is a possibility that the
greedy solution is not the best solution for every problem.
It makes the locally optimal choice at each stage with the intent of finding the global
optimum. Let's understand this through an example.
We have to travel from the source to the destination at the minimum cost. Since we
have three feasible solutions having cost paths as 10, 20, and 5. 5 is the minimum cost path
so it is the optimal solution. This is the local optimum, and in this way, we find the local
optimum at each stage in order to calculate the global optimal solution.
In the graph above, starting from the root node 10, if we greedily select the next node
to obtain the most weighted path, the next selected node will be 5. That takes the total
sum to 15, and the path ends there because 5 has no child; but the path 10 -> 5 is not the
maximum-weight path.
In order to find the most weighted path, all possible path sums must be computed and
compared to get the desired result. It can be seen that the most weighted path in the
graph is 10 -> 1 -> 30, which gives a path sum of 41.
Correct Approach
In such cases the greedy approach does not work; instead, complete paths from the root to
the leaf nodes have to be considered to get the correct answer, i.e. the most weighted path.
This can be achieved by recursively checking all the paths and calculating their weights.
Thus, for a greedy algorithm to be applicable, the problem must have the greedy-choice
property: a locally optimal choice must lead to a globally optimal solution.
Greedy Algorithm Vs Dynamic Programming
Greedy algorithms and dynamic programming are two of the most widely used algorithmic
paradigms for solving complex programming problems. The greedy approach works for
problems where a locally optimal choice leads to a globally optimal solution, while dynamic
programming works for problems with an overlapping-subproblems structure, where the
answer to one subproblem is needed for solving several other subproblems. Detailed
differences are given in the table below:
Feature: Feasibility
Greedy Algorithm: We make whatever choice seems best at the moment, in the hope
that it will lead to a globally optimal solution.
Dynamic Programming: We make a decision at each step considering the current
problem and the solutions to previously solved subproblems to calculate the optimal
solution.

Feature: Optimality
Greedy Algorithm: There is no guarantee of getting an optimal solution.
Dynamic Programming: It is guaranteed to generate an optimal solution, as it
generally considers all possible cases and then chooses the best.
Suppose S = {1, 2, ..., n} is a set of n proposed activities. The activities share a resource
which can be used by only one activity at a time, e.g., a tennis court or a lecture hall. Each
activity i has a start time si and a finish time fi, where si ≤ fi. If selected, activity i takes
place during the half-open time interval [si, fi). Activities i and j are compatible if the
intervals [si, fi) and [sj, fj) do not overlap (i.e., i and j are compatible if si ≥ fj or sj ≥ fi).
The activity-selection problem is to choose a maximum-size set of mutually compatible
activities.
Example: Given 10 activities along with their start and finish times as
S = (A1, A2, A3, A4, A5, A6, A7, A8, A9, A10)
si = (1, 2, 3, 4, 7, 8, 9, 9, 11, 12)
fi = (3, 5, 4, 7, 10, 9, 11, 13, 12, 14)
Solution: The solution to the above activity-scheduling problem using the greedy strategy
(sort by finish time, then repeatedly pick the next compatible activity) is illustrated below:
Now, schedule A1
Skip A5 as it is interfering.
Skip A8 as it is interfering.
Huffman coding:
Algorithm:
The method used to construct an optimal prefix code is called Huffman coding.
This algorithm builds a tree in a bottom-up manner. We denote this tree by T.
Let |c| be the number of leaves;
|c| - 1 merge operations are required to build the tree. Q is a min-priority queue, which
can be implemented as a binary heap.
Algorithm Huffman (c)
{
n = |c|
Q = c
for i <- 1 to n-1
do
{
temp <- get node()
left(temp) <- Get_Min(Q)
right(temp) <- Get_Min(Q)
f(temp) <- f(left(temp)) + f(right(temp))
insert(Q, temp)
}
return Get_Min(Q)
}
Step by step:
1. Create a leaf node for each unique character and build a min heap of all leaf nodes,
keyed on frequency.
2. Extract the two nodes with the minimum frequency from the min heap.
3. Create a new internal node with a frequency equal to the sum of the two nodes'
frequencies. Make the first extracted node its left child and the other extracted node
its right child. Add this node to the min heap.
4. Repeat steps 2 and 3 until the heap contains only one node. The remaining node is the
root node and the tree is complete.
Let us understand the algorithm with an example:
character Frequency
a 5
b 9
c 12
d 13
e 16
f 45
Step 1. Build a min heap that contains 6 nodes where each node represents root of a tree
with single node.
Step 2 Extract two minimum frequency nodes from min heap. Add a new internal node
with frequency 5 + 9 = 14.
Illustration of step 2
Now min heap contains 5 nodes where 4 nodes are roots of trees with single element each,
and one heap node is root of tree with 3 elements
character Frequency
c 12
d 13
Internal Node 14
e 16
f 45
Step 3: Extract two minimum frequency nodes from heap. Add a new internal node with
frequency 12 + 13 = 25
Illustration of step 3
Now min heap contains 4 nodes where 2 nodes are roots of trees with single element each,
and two heap nodes are root of tree with more than one nodes
character Frequency
Internal Node 14
e 16
Internal Node 25
f 45
Step 4: Extract two minimum frequency nodes. Add a new internal node with frequency
14 + 16 = 30
Illustration of step 4
Step 5: Extract two minimum frequency nodes. Add a new internal node with frequency
25 + 30 = 55
Illustration of step 5
Step 6: Extract two minimum frequency nodes. Add a new internal node with frequency
45 + 55 = 100
Illustration of step 6
Now the heap contains only one node, so the tree is complete.
The complete C program for the above approach is given below:
#include <stdio.h>
#include <stdlib.h>
#define MAX_TREE_HT 100

// A Huffman tree node
struct MinHeapNode { char data; unsigned freq; struct MinHeapNode *left, *right; };
// A min heap of tree-node pointers, keyed on frequency
struct MinHeap { unsigned size, capacity; struct MinHeapNode** array; };

struct MinHeapNode* newNode(char data, unsigned freq) {
    struct MinHeapNode* temp = malloc(sizeof(struct MinHeapNode));
    temp->left = temp->right = NULL;
    temp->data = data;
    temp->freq = freq;
    return temp;
}

struct MinHeap* createMinHeap(unsigned capacity) {
    struct MinHeap* minHeap = malloc(sizeof(struct MinHeap));
    minHeap->size = 0; // current size is 0
    minHeap->capacity = capacity;
    minHeap->array = malloc(capacity * sizeof(struct MinHeapNode*));
    return minHeap;
}

// A utility function to swap two min heap nodes
void swapMinHeapNode(struct MinHeapNode** a, struct MinHeapNode** b) {
    struct MinHeapNode* t = *a; *a = *b; *b = t;
}

// The standard min-heapify at index idx
void minHeapify(struct MinHeap* minHeap, int idx) {
    int smallest = idx, left = 2 * idx + 1, right = 2 * idx + 2;
    if (left < (int)minHeap->size
        && minHeap->array[left]->freq < minHeap->array[smallest]->freq)
        smallest = left;
    if (right < (int)minHeap->size
        && minHeap->array[right]->freq < minHeap->array[smallest]->freq)
        smallest = right;
    if (smallest != idx) {
        swapMinHeapNode(&minHeap->array[smallest], &minHeap->array[idx]);
        minHeapify(minHeap, smallest);
    }
}

// Extract the minimum-frequency node from the heap
struct MinHeapNode* extractMin(struct MinHeap* minHeap) {
    struct MinHeapNode* temp = minHeap->array[0];
    minHeap->array[0] = minHeap->array[minHeap->size - 1];
    --minHeap->size;
    minHeapify(minHeap, 0);
    return temp;
}

// Insert a new node, sifting it up to its position
void insertMinHeap(struct MinHeap* minHeap, struct MinHeapNode* minHeapNode) {
    ++minHeap->size;
    int i = minHeap->size - 1;
    while (i && minHeapNode->freq < minHeap->array[(i - 1) / 2]->freq) {
        minHeap->array[i] = minHeap->array[(i - 1) / 2];
        i = (i - 1) / 2;
    }
    minHeap->array[i] = minHeapNode;
}

void buildMinHeap(struct MinHeap* minHeap) {
    int n = minHeap->size - 1;
    for (int i = (n - 1) / 2; i >= 0; --i)
        minHeapify(minHeap, i);
}

// Utility function to check if this node is a leaf
int isLeaf(struct MinHeapNode* root) { return !(root->left) && !(root->right); }

struct MinHeap* createAndBuildMinHeap(char data[], int freq[], int size) {
    struct MinHeap* minHeap = createMinHeap(size);
    for (int i = 0; i < size; ++i)
        minHeap->array[i] = newNode(data[i], freq[i]);
    minHeap->size = size;
    buildMinHeap(minHeap);
    return minHeap;
}

// Repeatedly merge the two minimum-frequency nodes until one tree remains
struct MinHeapNode* buildHuffmanTree(char data[], int freq[], int size) {
    struct MinHeapNode *left, *right, *top;
    struct MinHeap* minHeap = createAndBuildMinHeap(data, freq, size);
    while (minHeap->size != 1) {
        left = extractMin(minHeap);
        right = extractMin(minHeap);
        top = newNode('$', left->freq + right->freq);
        top->left = left;
        top->right = right;
        insertMinHeap(minHeap, top);
    }
    return extractMin(minHeap);
}

// Print the codes: 0 for a left edge, 1 for a right edge
void printCodes(struct MinHeapNode* root, int arr[], int top) {
    if (root->left)  { arr[top] = 0; printCodes(root->left, arr, top + 1); }
    if (root->right) { arr[top] = 1; printCodes(root->right, arr, top + 1); }
    if (isLeaf(root)) {
        printf("%c: ", root->data);
        for (int i = 0; i < top; ++i) printf("%d", arr[i]);
        printf("\n");
    }
}

void HuffmanCodes(char data[], int freq[], int size) {
    // Construct Huffman Tree, then print its codes
    struct MinHeapNode* root = buildHuffmanTree(data, freq, size);
    int arr[MAX_TREE_HT], top = 0;
    printCodes(root, arr, top);
}

// Driver code
int main() {
    char arr[] = { 'a', 'b', 'c', 'd', 'e', 'f' };
    int freq[] = { 5, 9, 12, 13, 16, 45 };
    HuffmanCodes(arr, freq, sizeof(arr) / sizeof(arr[0]));
    return 0;
}
Output
f: 0
c: 100
d: 101
a: 1100
b: 1101
e: 111
Time complexity: O(n log n), where n is the number of unique characters. If there are
n nodes, extractMin() is called 2*(n - 1) times. extractMin() takes O(log n) time as it calls
minHeapify(). So, the overall complexity is O(n log n).
If the input array is sorted, there exists a linear-time algorithm.
Space complexity: O(n)
Applications of Huffman Coding:
1. They are used for transmitting fax and text.
2. They are used by conventional compression formats like PKZIP, GZIP, etc.
3. Multimedia codecs like JPEG, PNG, and MP3 use Huffman encoding (more precisely,
prefix codes).
NP-Completeness:
A decision problem L is NP-Hard if
L' ≤p L for every L' ∈ NP.
Definition: L is NP-complete if
1. L ∈ NP, and
2. L' ≤p L for some known NP-complete problem L'.
Given this formal definition, the complexity classes are:
NP: the set of decision problems that can be verified in polynomial time.
NP-Hard: L is NP-hard if for all L' ∈ NP, L' ≤p L. Thus if we can solve L in polynomial
time, we can solve all NP problems in polynomial time.
NP-Complete: L is NP-complete if
1. L ∈ NP, and
2. L is NP-hard.
If any NP-Complete problem is solvable in polynomial time, then every NP-Complete
problem is also solvable in polynomial time. Conversely, if we can prove that any NP-
Complete problem cannot be solved in polynomial time, then no NP-Complete problem can
be solved in polynomial time.
Reductions
Concept: Even when no efficient solution for an NPC problem is known, one NPC problem
can be converted into another NPC problem in polynomial time. For this, we need the
concept of reduction. If a polynomial-time solution exists for any one NPC problem, then
every other problem in NPC can also be solved in polynomial time (though this is widely
believed not to happen).
Example: Suppose there are two problems, A and B. You know that it is impossible to
solve problem A in polynomial time, and you want to prove that B cannot be solved in
polynomial time either. To do so, you convert problem A into problem B in polynomial time.
1. The point to note here: the output (a candidate solution) is already given, and you can
verify that output/solution in polynomial time, but you cannot produce an
output/solution in polynomial time.
2. We need the concept of reduction because, when you cannot directly produce an output
for a problem from the given input, you convert that problem into another problem.
Note 1: If a problem satisfies both points, it falls into the NP-complete class.
Note 2: If a problem satisfies only the second point, it falls into the NP-hard class.
So, for a given decision-based NP problem, the answer is either yes or no. If the answer
can be verified in polynomial time, and the problem can be obtained from a known NP-
complete problem via reduction, then the decision problem is NP-complete.
Many problems are hard to solve, but they have the property that it is easy to verify a
solution if one is provided.
Consider the Hamiltonian cycle problem. Given an undirected graph G, does G have a cycle
that visits each vertex exactly once? There is no known polynomial-time algorithm for this
problem.
Note: This means you cannot build a Hamiltonian cycle for a graph in polynomial time,
but if a specific cycle is given, you can verify in polynomial time whether it is a
Hamiltonian cycle.
Fig: Hamiltonian Cycle
Suppose a graph did have a Hamiltonian cycle. It would be easy for someone to
convince us of this. They would simply say: "the cycle is ⟨v3, v7, v1, ..., v13⟩".
We could then inspect the graph and check that this is indeed a legal cycle and that it
visits all of the vertices of the graph exactly once. Thus, even though we know of no efficient
way to solve the Hamiltonian cycle problem, there is an efficient way to verify that a given
cycle is indeed a Hamiltonian cycle.
1. P is contained in NP.
2. Whether P = NP is an open question.
Reductions:
The class of NP-complete (NPC) problems consists of a set of decision problems (a subset
of class NP) that no one knows how to solve efficiently. But if there were a polynomial-time
solution for even a single NP-complete problem, then every problem in NPC would be
solvable in polynomial time. For this, we need the concept of reductions.
Suppose there are two problems, A and B. You know that it is impossible to solve
problem A in polynomial time. You want to prove that B cannot be solved in polynomial
time. We want to show that (A ∉ P) => (B ∉ P).
Coloring arises in various partitioning problems where there is a constraint that two
objects cannot be assigned to the same partition. The term "coloring" comes from the
original application, which was map drawing: two countries that share a common border
should be colored with different colors.
It is well known that planar graphs (maps) can be colored with four colors, and there
exists a polynomial-time algorithm for this. But deciding whether a given graph can be
colored with 3 colors is hard, and no polynomial-time algorithm is known for it.
NP completeness proof:
Figure 28.12.1: We will use this sequence of reductions for the NP Complete Proof
NP Problems:
NP is the set of problems whose solutions may be hard to find but are easy to verify, and
which can be solved by a non-deterministic machine in polynomial time.
NP-Hard Problem:
A problem X is NP-Hard if there is an NP-Complete problem Y such that Y is reducible
to X in polynomial time. NP-Hard problems are at least as hard as NP-Complete problems,
but an NP-Hard problem need not be in the class NP.
Equivalently, a problem is NP-hard if every problem in NP can be polynomial-time
reduced to it. Often one proves NP-hardness by taking a particular known hard problem
and reducing it to the problem at hand.
Examples:
1. Hamiltonian cycle
2. Optimization versions of NP-complete problems (e.g., TSP optimization)
3. Longest path
NP-Complete Problem:
A problem X is NP-Complete if X is in NP and there is an NP-Complete problem Y such
that Y is reducible to X in polynomial time. NP-Complete problems are the hardest
problems in NP: a problem is NP-Complete if it is in both NP and NP-Hard. A non-
deterministic Turing machine can solve an NP-Complete problem in polynomial time.
In other words, a problem is NP-complete when it is both in NP and NP-hard; this
means NP-complete problems can be verified in polynomial time.
Examples:
1. Boolean satisfiability (SAT)
2. Hamiltonian cycle (decision version)
Difference between NP-Hard and NP-Complete:
NP-Hard: To solve this problem, it does not have to be in NP.
NP-Complete: To solve this problem, it must be both in NP and NP-hard.