
6 Dynamic Programming

The document covers dynamic programming as a powerful algorithm design technique for solving optimization problems, emphasizing its practical applications and efficiency compared to naive algorithms. It provides examples, including the Fibonacci sequence and the shortest path problem, illustrating how dynamic programming can optimize recursive solutions by avoiding redundant computations. Additionally, it introduces the knapsack problem, detailing its formulation and the dynamic programming approach to find optimal solutions efficiently.


BIL206: Algorithm Analysis and Design

Dynamic Programming

Slides courtesy from Jianhua Ruan, The University of Texas at San Antonio
And some slides from Jeff Edmonds @ York University
5/15/2023
In the next few lectures

• Two important algorithm design techniques


– Dynamic programming
– Greedy algorithm
• Meta algorithms, not actual algorithms (like
divide-and-conquer)
• Very useful in practice
• Once you understand them, it is often easier to
reinvent certain algorithms than trying to look
them up!
Optimization Problems
• An important and practical class of computational
problems.
– For each problem instance, there are many feasible solutions
(often exponential)
– Each solution has a value (or cost)
– The goal is to find a solution with the optimal value
• For most of these, the best known algorithm runs in
exponential time.
• Industry would pay dearly to have faster algorithms.
• For some of them, there are quick dynamic
programming or greedy algorithms.
• For the rest, you may have to consider acceptable
solutions rather than optimal solutions.
Dynamic programming

• The word “programming” is historical and predates


computer programming.
• A very powerful, general tool for solving certain
optimization problems.
• Once understood it is relatively easy to apply.
• It looks like magic until you have seen enough
examples.
A motivating example: Fibonacci
numbers
• 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …

F(0) = 1;
F(1) = 1;
F(n) = F(n-1) + F(n-2)

• How to compute F(n)?


A recursive algorithm
function fib(n)
if (n == 0 or n == 1) return 1;
else return fib(n-1) + fib(n-2);

What’s the time complexity?


[Recursion tree for F(9): F(9) calls F(8) and F(7); F(8) calls F(7) and F(6); each F(k) recurs into F(k-1) and F(k-2). The leftmost branch has height n, the rightmost height n/2.]

• Top-down solution: divide & conquer -> O(2^n)
• Time complexity between 2^(n/2) and 2^n
T(n) = Θ(F(n)) = O(1.6^n)
An iterative algorithm
function fib(n)
  F[0] = 1; F[1] = 1;
  for i = 2 to n
    F[i] = F[i-1] + F[i-2];
  return F[n];

Time complexity?
An iterative algorithm
• Problem with the recursive Fib algorithm:
– Each subproblem is solved many times!

• Solution: avoid solving the same subproblem more than once
(1) Pre-compute all subproblems that may be needed later
(2) Compute on demand, but memoize the solution to avoid recomputing

• Can you always speed up a recursive algorithm by making it an iterative algorithm?
– E.g., merge sort
– No, since there is no overlap between its two sub-problems
An iterative algorithm
• Bottom-up solution: dynamic programming
• Calculate F(2), F(3), …, F(n) in a table -> O(n)
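The contrast between the two solutions can be sketched in Python, using the slides' convention F(0) = F(1) = 1:

```python
def fib_recursive(n):
    # Naive top-down recursion: recomputes subproblems, exponential time.
    if n == 0 or n == 1:
        return 1
    return fib_recursive(n - 1) + fib_recursive(n - 2)

def fib_iterative(n):
    # Bottom-up dynamic programming: each F[i] is computed once, O(n) time.
    F = [1] * (n + 1)
    for i in range(2, n + 1):
        F[i] = F[i - 1] + F[i - 2]
    return F[n]
```

Both return the same values; only the work done differs.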
Another motivating example:
Shortest path problem

start

goal

• Find the shortest path from start to goal


• An optimization problem
Possible approaches
• Brute-force algorithm
– Enumerate all possible paths and compare
– How many?
• Greedy algorithm
– Always use the currently shortest edge
– Does not necessarily lead to the optimal solution
• Can you think about a recursive strategy?
Optimal subpaths
• Claim: if a path start→goal is optimal, any sub-path x→y, where x and y are on the optimal path, is also optimal.

[Figure: path start → x → y → goal with segment lengths a, b, c, so the total a + b + c is shortest; a shortcut b' < b between x and y would give a + b' + c < a + b + c.]

• Proof by contradiction
– Suppose the subpath b between x and y is not the shortest
– Then we can replace it with a shorter one, b'
– which reduces the total length of the path
– so the "optimal" path from start to goal was not the shortest => contradiction!
– Hence, the subpath b must be the shortest among all paths from x to y
Shortest path problem

SP(start, goal) = min { dist(start, A) + SP(A, goal),
                        dist(start, B) + SP(B, goal),
                        dist(start, C) + SP(C, goal) }
A special shortest path problem

[Figure: an (m+1) × (n+1) grid with start S at the top-left and goal G at the bottom-right.]

Each edge has a length (cost). We need to get to G from S. We can only move
to the right or down. Aim: find a path with the minimum total length
Use a greedy approach

[Figure: the example grid with edge costs. Greedily taking the locally cheapest edge at each step leads away from the globally shortest path.]
Recursive solution
• Suppose we've found the shortest path to (m, n)
• It must use one of the two edges:
– (m, n-1) to (m, n)   Case 1
– (m-1, n) to (m, n)   Case 2
• If case 1
– find the shortest path from (0, 0) to (m, n-1)
– SP(0, 0, m, n-1) + dist(m, n-1, m, n) is the overall shortest path
• If case 2
– find the shortest path from (0, 0) to (m-1, n)
– SP(0, 0, m-1, n) + dist(m-1, n, m, n) is the overall shortest path
• We don't know which case is true
– But if we've found the two paths, we can compare
– The actual shortest path is the one with the shorter overall length
Recursive solution
Let F(i, j) = SP(0, 0, i, j). => F(m, n) is the length of the shortest path from (0, 0) to (m, n)

F(m, n) = min { F(m-1, n) + dist(m-1, n, m, n),
                F(m, n-1) + dist(m, n-1, m, n) }
Generalize

F(i, j) = min { F(i-1, j) + dist(i-1, j, i, j),
                F(i, j-1) + dist(i, j-1, i, j) }

• To find the shortest path from (0, 0) to (m, n), we need to find the shortest paths to both (m-1, n) and (m, n-1)
• If we use recursive calls, some subpaths will be recomputed many times
• Strategy: solve subproblems in an order such that redundant computation is avoided
Dynamic Programming
Let F(i, j) = SP(0, 0, i, j). => F(m, n) is the length of the shortest path from (0, 0) to (m, n)

F(i, j) = min { F(i-1, j) + dist(i-1, j, i, j),
                F(i, j-1) + dist(i, j-1, i, j) }

i = 1 .. m, j = 1 .. n
Boundary condition: i = 0 or j = 0. Easy to figure out.
Number of unique subproblems = m*n
F(i, j) depends on F(i-1, j) and F(i, j-1); this data dependency determines the order of computation: left to right, top to bottom
Dynamic programming illustration

[Figure: the example grid, with each cell filled using
F(i, j) = min { F(i-1, j) + dist(i-1, j, i, j),
                F(i, j-1) + dist(i, j-1, i, j) }]
Dynamic programming illustration

[Figure: the grid fully filled in. The computed F values, row by row from S (top-left) to G (bottom-right):
  0   3  12  13  15
  5   6   8  13  15
  7   9  11  13  16
 13  11  14  17  20
 17  17  17  18  20]

F(i, j) = min { F(i-1, j) + dist(i-1, j, i, j),
                F(i, j-1) + dist(i, j-1, i, j) }
Trace back

[Figure: the filled grid again. Starting at G and repeatedly stepping back to the neighbor that produced each minimum recovers the shortest path.]
Shortest path has length 20


What is the actual shortest path?
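The tabulation and trace-back above can be sketched in Python. The grid is encoded as two cost matrices, `right[i][j]` (edge from (i, j) to (i, j+1)) and `down[i][j]` (edge from (i, j) to (i+1, j)); these names and the tiny test grid are illustrative assumptions, not the slide's actual edge costs.

```python
def grid_shortest_path(right, down):
    # right[i][j]: cost of edge (i, j) -> (i, j+1), for i in 0..m, j in 0..n-1
    # down[i][j]:  cost of edge (i, j) -> (i+1, j), for i in 0..m-1, j in 0..n
    m, n = len(down), len(right[0])
    INF = float("inf")
    F = [[INF] * (n + 1) for _ in range(m + 1)]
    F[0][0] = 0
    # Fill left to right, top to bottom: F(i-1, j) and F(i, j-1) are ready first.
    for i in range(m + 1):
        for j in range(n + 1):
            if i > 0:
                F[i][j] = min(F[i][j], F[i - 1][j] + down[i - 1][j])
            if j > 0:
                F[i][j] = min(F[i][j], F[i][j - 1] + right[i][j - 1])
    # Trace back from (m, n) by re-checking which case produced each minimum.
    path, i, j = [(m, n)], m, n
    while (i, j) != (0, 0):
        if i > 0 and F[i][j] == F[i - 1][j] + down[i - 1][j]:
            i -= 1
        else:
            j -= 1
        path.append((i, j))
    return F[m][n], path[::-1]
```

On a 1×1 grid with `right = [[1], [10]]` and `down = [[2, 4]]`, the right-then-down path costs 1 + 4 = 5 and the down-then-right path costs 2 + 10 = 12, so the DP returns 5 with the corresponding path.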
Elements of dynamic programming
• Optimal sub-structures
– Optimal solutions to the original problem contains
optimal solutions to sub-problems
• Overlapping sub-problems
– Some sub-problems appear in many solutions
• Memoization and reuse
– Carefully choose the order that sub-problems are solved,
such that each sub-problem will be solved for at most
once and the solution can be reused
Two steps to dynamic programming

• Formulate the solution as a recurrence relation of


solutions to subproblems.
• Specify an order to solve the subproblems so you
always have what you need.
– Bottom-up
• Tabulate the solutions to all subproblems before they are used
– What if you cannot determine the order easily, or if not all subproblems are needed?
– Top-down
• Compute when needed.
• Remember the ones you’ve computed
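The top-down strategy, applied to the earlier Fibonacci example, is a one-liner in Python: `functools.lru_cache` remembers every computed value, so each subproblem is solved at most once even without choosing an order.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Compute when needed; the cache remembers results, so each
    # subproblem is solved at most once -> O(n) time.
    if n == 0 or n == 1:
        return 1
    return fib(n - 1) + fib(n - 2)
```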
Question
• If instead of shortest path, we want (simple)
longest path, can we still use dynamic
programming?
– Simple means no cycle allowed
Example problems that can be
solved by dynamic programming
• Sequence alignment problem (several different
versions)
• Shortest path in general graphs
• Knapsack (several different versions)
• Scheduling
• etc.
• We’ll discuss alignment, knapsack, and scheduling
problems.
Knapsack problem

• Each item has a value and a weight


• Objective: maximize value
• Constraint: knapsack has a weight limitation

Three versions:
0-1 knapsack problem: take each
item or leave it
Fractional knapsack problem:
items are divisible
Unbounded knapsack problem:
unlimited supplies of each item.
Which one is easiest to solve?

We study the 0-1 problem today.


Formal definition (0-1 problem)
• Knapsack has weight limit W
• Items labeled 1, 2, …, n (arbitrarily)
• Items have weights w1, w2, …, wn
– Assume all weights are integers
– For practical reasons, only consider wi ≤ W
• Items have values v1, v2, …, vn
• Objective: find a subset of items, S, such that Σ_{i∈S} wi ≤ W and Σ_{i∈S} vi is maximal among all such (feasible) subsets
Naïve algorithms
• Enumerate all subsets.
– Optimal. But exponential time
• Greedy 1: take the item with the largest value
– Not optimal
– Give an example
• Greedy 2: take the item with the largest
value/weight ratio
– Not optimal
– Give an example
A DP algorithm
• Suppose you’ve found the optimal solution S
• Case 1: item n is included
• Case 2: item n is not included

wn wn

Total weight limit: Total weight limit:


W W

Find an optimal solution using items Find an optimal solution using items
1, 2, …, n-1 with weight limit W - wn 1, 2, …, n-1 with weight limit W
Dynamic Approach for Knapsack
• We need to derive a recurrence relation that
expresses a solution to an instance of the problem
in terms of solutions to its smaller subinstances.
• Consider an instance defined by the first i items, 1 ≤ i ≤ n,
– with weights w1, …, wi
– values v1, …, vi
– and capacity j, 1 ≤ j ≤ W
– Let V[i, j] be the value of an optimal solution to this instance, i.e., of the best subset of the first i items that fits into the knapsack of capacity j
Dynamic Approach for Knapsack
• We can divide all subsets of the first i items that
fit the knapsack of capacity w into two categories
– 1. Among the subsets that do not include the ith item,
the value of an optimal subset is, V[ i-1, w]
– 2. Among the subsets that do include the ith item, an
optimal subset is made up of this item and an optimal
subset of the first i-1 items that fit into the knapsack
of capacity w-wi (w-wi>=0)
– The value of such an optimal subset is vi+ V[i-1, w-wi]
Recursive formulation

• Let V[i, w] be the optimal total value when items 1, 2, …, i are considered for a knapsack with weight limit w
=> V[n, W] is the optimal solution

V[n, W] = max { V[n-1, W-wn] + vn,
                V[n-1, W] }

Generalize

V[i, w] = max { V[i-1, w-wi] + vi,    (item i is taken)
                V[i-1, w] }           (item i not taken)
V[i, w] = V[i-1, w]   if wi > w       (item i cannot be taken)

Boundary condition: V[i, 0] = 0, V[0, w] = 0. Number of sub-problems = n*W


Example
• n = 6 (# of items)
• W = 10 (weight limit)
• Items (weight, value):
(2, 2), (4, 3), (3, 3), (5, 6), (2, 4), (6, 9)
      w:  0  1  2  3  4  5  6  7  8  9 10
 i  wi vi
 0         0  0  0  0  0  0  0  0  0  0  0
 1   2  2  0
 2   4  3  0
 3   3  3  0
 4   5  6  0
 5   2  4  0
 6   6  9  0

To fill V[i, w], look wi columns to the left in the previous row for V[i-1, w-wi], and directly above for V[i-1, w]:

V[i, w] = max { V[i-1, w-wi] + vi,    (item i is taken)
                V[i-1, w] }           (item i not taken)
V[i, w] = V[i-1, w]   if wi > w       (item i cannot be taken)
The filled table:

      w:  0  1  2  3  4  5  6  7  8  9 10
 i  wi vi
 0         0  0  0  0  0  0  0  0  0  0  0
 1   2  2  0  0  2  2  2  2  2  2  2  2  2
 2   4  3  0  0  2  2  3  3  5  5  5  5  5
 3   3  3  0  0  2  3  3  5  5  6  6  8  8
 4   5  6  0  0  2  3  3  6  6  8  9  9 11
 5   2  4  0  0  4  4  6  7  7 10 10 12 13
 6   6  9  0  0  4  4  6  7  9 10 13 13 15
Trace back from V[6, 10]: whenever V[i, w] = V[i-1, w-wi] + vi (rather than V[i-1, w]), item i was taken.
Optimal value: 15
Item: 6, 5, 1
Weight: 6 + 2 + 2 = 10
Value: 9 + 4 + 2 = 15
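The table-filling and trace-back above can be sketched in Python and checked against this example:

```python
def knapsack_01(weights, values, W):
    # V[i][w]: best value using items 1..i with weight limit w.
    n = len(weights)
    V = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        wi, vi = weights[i - 1], values[i - 1]
        for w in range(W + 1):
            if wi > w:
                V[i][w] = V[i - 1][w]                 # item i cannot fit
            else:
                V[i][w] = max(V[i - 1][w],            # item i not taken
                              V[i - 1][w - wi] + vi)  # item i taken
    # Trace back: item i was taken wherever the value changed.
    taken, w = [], W
    for i in range(n, 0, -1):
        if V[i][w] != V[i - 1][w]:
            taken.append(i)
            w -= weights[i - 1]
    return V[n][W], taken
```

On the six items above with W = 10 this returns value 15 with items 6, 5, and 1, matching the table.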
0-1 Knapsack: time complexity
• Θ(nW)
• Polynomial?
– Pseudo-polynomial
– Works well if W is small
• Consider the following items (weight, value):
(10, 5), (15, 6), (20, 5), (18, 6)
• Weight limit 35
– Optimal solution: items 2, 4 (value = 12). Enumeration: 2^4 = 16 subsets
– Dynamic programming: fill up a 4 x 35 = 140 entry table
• What’s the problem?
– Many entries are unused: no such weight combination
– Top-down may be better
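A top-down sketch with a dictionary memo touches only the (i, w) pairs that actually arise, which is the point of the remark above; it can be checked on the four-item instance:

```python
def knapsack_topdown(weights, values, W):
    memo = {}

    def best(i, w):
        # Best value using items 1..i with remaining capacity w,
        # computed on demand and memoized.
        if i == 0:
            return 0
        if (i, w) not in memo:
            result = best(i - 1, w)         # skip item i
            if weights[i - 1] <= w:         # take item i if it fits
                result = max(result,
                             best(i - 1, w - weights[i - 1]) + values[i - 1])
            memo[(i, w)] = result
        return memo[(i, w)]

    return best(len(weights), W)
```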
Sequence alignment
• Compare two strings to see if they are similar
– We need to define similarity
– Very useful in many applications
– Comparing DNA sequences, articles, source code, etc.

– Example: the Longest Common Subsequence (LCS) problem
Common subsequence
• A subsequence of a string is the string with zero
or more chars left out
• A common subsequence of two strings:
– A subsequence of both strings
– Ex: x = {A B C B D A B }, y = {B D C A B A}
– {B C} and {A A} are both common subsequences of x
and y
Longest Common Subsequence
• Given two sequences x[1 . . m] and y[1 . . n], find a
longest subsequence common to them both.

“a” not “the”

x: A B C B D A B
BCBA =
LCS(x, y)
y: B D C A B A
functional notation,
but not a function
Brute-force LCS algorithm
Check every subsequence of x[1 . . m] to see
if it is also a subsequence of y[1 . . n].
Analysis
• 2^m subsequences of x (each bit-vector of length m determines a distinct subsequence of x).
• Hence, the runtime would be exponential!

Towards a better algorithm: a DP strategy


• Key: optimal substructure and overlapping sub-problems
• First we’ll find the length of LCS. Later we’ll modify
the algorithm to find LCS itself.
Optimal substructure

• Notice that the LCS problem has optimal substructure:


parts of the final solution are solutions of subproblems.
– If z = LCS(x, y), then any prefix of z is an LCS of a prefix of
x and a prefix of y.
[Figure: z = LCS(x, y) matches a prefix x[1..i] of x with a prefix y[1..j] of y.]

• Subproblems: “find LCS of pairs of prefixes of x and y”


Recursive thinking

• Case 1: x[m] = y[n]. There is an optimal LCS that matches x[m] with y[n]. Find LCS(x[1..m-1], y[1..n-1])
• Case 2: x[m] ≠ y[n]. At most one of them is in the LCS
– Case 2.1: x[m] not in LCS. Find LCS(x[1..m-1], y[1..n])
– Case 2.2: y[n] not in LCS. Find LCS(x[1..m], y[1..n-1])
Recursive thinking

• Case 1: x[m] = y[n]. Reduce both sequences by 1 char:
– LCS(x, y) = LCS(x[1..m-1], y[1..n-1]) || x[m]   (|| denotes concatenation)
• Case 2: x[m] ≠ y[n]. Reduce either sequence by 1 char:
– LCS(x, y) = LCS(x[1..m-1], y[1..n]) or LCS(x[1..m], y[1..n-1]), whichever is longer
Finding length of LCS

• Let c[i, j] be the length of LCS(x[1..i], y[1..j])
=> c[m, n] is the length of LCS(x, y)
• If x[m] = y[n]:
c[m, n] = c[m-1, n-1] + 1
• If x[m] ≠ y[n]:
c[m, n] = max { c[m-1, n], c[m, n-1] }
Generalize: recursive formulation

c[i, j] = c[i-1, j-1] + 1               if x[i] = y[j],
c[i, j] = max { c[i-1, j], c[i, j-1] }  otherwise.
Recursive algorithm for LCS
LCS(x, y, i, j)
  if x[i] = y[j]
    then c[i, j] ← LCS(x, y, i-1, j-1) + 1
    else c[i, j] ← max{ LCS(x, y, i-1, j), LCS(x, y, i, j-1) }

Worst case: x[i] ≠ y[j], in which case the algorithm evaluates two subproblems, each with only one parameter decremented.
Recursion tree
m = 3, n = 4:

[Recursion tree: (3,4) calls (2,4) and (3,3); (2,4) calls (1,4) and (2,3); (3,3) calls (2,3) and (3,2); note (2,3) appears twice: the same subproblem recurs.]

Height = m + n => work potentially exponential, but we're re-solving subproblems that were already solved!
DP Algorithm
• Key: find out the correct order to solve the sub-problems
• Total number of sub-problems: m * n

c[i–1, j–1] + 1 if x[i] = y[j],


c[i, j] = max{c[i–1, j], c[i, j–1]} otherwise.

0 j n

i C(i, j)

m
DP Algorithm
LCS-Length(X, Y)
1. m = length(X) // get the # of symbols in X
2. n = length(Y) // get the # of symbols in Y
3. for i = 1 to m c[i,0] = 0 // special case: Y[0]
4. for j = 1 to n c[0,j] = 0 // special case: X[0]
5. for i = 1 to m // for all X[i]
6. for j = 1 to n // for all Y[j]
7. if ( X[i] == Y[j])
8. c[i,j] = c[i-1,j-1] + 1
9. else c[i,j] = max( c[i-1,j], c[i,j-1] )
10. return c
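LCS-Length translates directly to Python (0-based strings, with the table indexed as in the pseudocode):

```python
def lcs_length(X, Y):
    m, n = len(X), len(Y)
    # c[i][j]: length of an LCS of X[1..i] and Y[1..j]; row 0 and column 0 stay 0.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c
```

For X = ABCB and Y = BDCAB, `lcs_length(X, Y)[4][5]` gives 3, matching the worked example that follows.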
LCS Example
We’ll see how the LCS algorithm works on the following example:
• X = ABCB
• Y = BDCAB

What is the LCS of X and Y?

LCS(X, Y) = BCB

X = ABCB; m = |X| = 4
Y = BDCAB; n = |Y| = 5
Allocate array c[5, 6]; set row 0 and column 0 to 0, then fill the table row by row, left to right, with

if ( X[i] == Y[j] )
  c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

The finished table:

 j:       0   1   2   3   4   5
          -   B   D   C   A   B
 i=0  -   0   0   0   0   0   0
 i=1  A   0   0   0   0   1   1
 i=2  B   0   1   1   1   1   2
 i=3  C   0   1   1   2   2   2
 i=4  B   0   1   1   2   2   3
LCS Algorithm Running Time

• LCS algorithm calculates the values of each entry of


the array c[m,n]
• So what is the running time?

O(m*n)
since each c[i,j] is calculated in
constant time, and there are m*n
elements in the array
How to find actual LCS
• The algorithm just found the length of LCS, but not LCS itself.
• How to find the actual LCS?
• For each c[i,j] we know how it was acquired:

c[i, j] = c[i-1, j-1] + 1               if x[i] = y[j],
c[i, j] = max( c[i, j-1], c[i-1, j] )   otherwise

• A match happens only when the first equation is taken
• So we can start from c[m, n] and go backwards, recording x[i] whenever c[i, j] = c[i-1, j-1] + 1.

For example, in the 2x2 fragment
  2 2
  2 3
the bottom-right entry came from c[i, j] = c[i-1, j-1] + 1 = 2 + 1 = 3, i.e., a match.
Finding LCS

 j:       0   1   2   3   4   5
          -   B   D   C   A   B
 i=0  -   0   0   0   0   0   0
 i=1  A   0   0   0   0   1   1
 i=2  B   0   1   1   1   1   2
 i=3  C   0   1   1   2   2   2
 i=4  B   0   1   1   2   2   3

Time for trace back: O(m+n).

Tracing back from c[4, 5] collects the matched characters:
LCS (reversed order): B C B
LCS (straight order): B C B
(this string turned out to be a palindrome)
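The fill-and-trace-back procedure can be sketched as one self-contained Python function:

```python
def lcs(X, Y):
    m, n = len(X), len(Y)
    # Fill the length table c as before.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    # Trace back from c[m][n], recording X[i] on every match
    # (i.e., wherever c[i][j] = c[i-1][j-1] + 1 was taken).
    i, j, out = m, n, []
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1])
            i, j = i - 1, j - 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))
```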
A more general problem
• Aligning two strings, such that
Match = 1
Mismatch = 0 (or other scores)
Insertion/deletion = -1
• Aligning ABBC with CABC
– LCS = 3: ABC
– Best alignment:
  ABBC
  CABC
  Score = 2 (the other alignment shown scores 1)
Alignment Problem
• Let F(i, j) be the best alignment score between X[1..i] and Y[1..j].
• F(m, n) is the best alignment score between X and Y
• Recurrence

F(i, j) = max { F(i-1, j-1) + δ(i, j),   (match/mismatch)
                F(i-1, j) - 1,           (insertion on Y)
                F(i, j-1) - 1 }          (insertion on X)

δ(i, j) = 1 if X[i] = Y[j] and 0 otherwise.


Alignment Example: ABBC vs CABC

X = ABBC; m = |X| = 4
Y = CABC; n = |Y| = 4
Allocate array F[5, 5]
Alignment Example: ABBC vs CABC

 j:       0   1   2   3   4
          -   C   A   B   C
 i=0  -   0  -1  -2  -3  -4
 i=1  A  -1   0   0  -1  -2
 i=2  B  -2  -1   0   1   0
 i=3  B  -3  -2  -1   1   1
 i=4  C  -4  -2  -2   0   2

F(i, j) = max { F(i-1, j-1) + δ(i, j),   (match/mismatch)
                F(i-1, j) - 1,           (insertion on Y)
                F(i, j-1) - 1 }          (insertion on X)

δ(i, j) = 1 if X[i] = Y[j] and 0 otherwise.


Alignment Example: trace back

Tracing back from F(4, 4) = 2 recovers the best alignment:
  ABBC
  CABC
Score = 2

F(i, j) = max { F(i-1, j-1) + δ(i, j),   (match/mismatch)
                F(i-1, j) - 1,           (insertion on Y)
                F(i, j-1) - 1 }          (insertion on X)

δ(i, j) = 1 if X[i] = Y[j] and 0 otherwise.
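This recurrence (match = 1, mismatch = 0, gap = -1, with the boundary rows scored as runs of gaps) can be sketched as:

```python
def align_score(X, Y):
    m, n = len(X), len(Y)
    # F[i][j]: best alignment score of X[1..i] vs Y[1..j].
    F = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        F[i][0] = -i                              # X[1..i] aligned against gaps
    for j in range(1, n + 1):
        F[0][j] = -j                              # Y[1..j] aligned against gaps
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            delta = 1 if X[i - 1] == Y[j - 1] else 0
            F[i][j] = max(F[i - 1][j - 1] + delta,  # match/mismatch
                          F[i - 1][j] - 1,          # insertion on Y
                          F[i][j - 1] - 1)          # insertion on X
    return F[m][n]
```

For X = ABBC and Y = CABC this returns 2, the value F(4, 4) in the worked table.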


Alignment as a longest path problem

[Figure: the alignment table viewed as a grid graph over C A B C: diagonal edges score 1 (match) or 0 (mismatch), horizontal and vertical edges score -1; the best alignment corresponds to the highest-scoring (longest) path.]
