0% found this document useful (0 votes)
0 views

algorithm_lecture 07_LCS

The document discusses the Longest Common Subsequence (LCS) algorithm, which is used to compare two sequences, such as DNA strings. It details the brute-force approach and the more efficient dynamic programming method to find the length of the LCS, explaining the recursive solution and providing an example with sequences X and Y. The algorithm's complexity and the process of filling a table to derive the LCS length are also outlined.

Uploaded by

tachbirdewan.bd
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
0 views

algorithm_lecture 07_LCS

The document discusses the Longest Common Subsequence (LCS) algorithm, which is used to compare two sequences, such as DNA strings. It details the brute-force approach and the more efficient dynamic programming method to find the length of the LCS, explaining the recursive solution and providing an example with sequences X and Y. The algorithm's complexity and the process of filling a table to derive the LCS length are also outlined.

Uploaded by

tachbirdewan.bd
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 29

Algorithms

Dynamic programming
Longest Common Subsequence

01/21/25 1
Longest Common Subsequence (LCS)

Application: comparison of two DNA strings


Ex: X= {A B C B D A B }, Y= {B D C A B A}
Longest Common Subsequence:
X= AB C B DAB
Y= BDCA B A
Brute force algorithm would compare each
subsequence of X with the symbols in Y

01/21/25 2
LCS Algorithm
• if |X| = m, |Y| = n, then there are 2m
subsequences of x; we must compare each
with Y (n comparisons)
• So the running time of the brute-force
algorithm is O(n 2m)
• Notice that the LCS problem has optimal
substructure: solutions of subproblems are
parts of the final solution.
• Subproblems: “find LCS of pairs of prefixes
of X and Y”

01/21/25 3
LCS Algorithm
• First we’ll find the length of LCS. Later we’ll
modify the algorithm to find LCS itself.
• Define Xi, Yj to be the prefixes of X and Y of
length i and j respectively
• Define c[i,j] to be the length of LCS of Xi and
Yj
• Then the length of LCS of X and Y will be
c[m,n] c[i  1, j  1]  1 if x[i ]  y[ j ],
c[i, j ] 
 max(c[i, j  1], c[i  1, j ]) otherwise

01/21/25 4
LCS recursive solution
c[i  1, j  1]  1 if x[i ]  y[ j ],
c[i, j ] 
 max(c[i, j  1], c[i  1, j ]) otherwise

• We start with i = j = 0 (empty substrings of x


and y)
• Since X0 and Y0 are empty strings, their LCS
is always empty (i.e. c[0,0] = 0)
• LCS of empty string and any other string is
empty, so for every i and j: c[0, j] = c[i,0] = 0
01/21/25 5
LCS recursive solution
c[i  1, j  1]  1 if x[i ]  y[ j ],
c[i, j ] 
 max(c[i, j  1], c[i  1, j ]) otherwise

• When we calculate c[i,j], we consider two


cases:
• First case: x[i]=y[j]: one more symbol in
strings X and Y matches, so the length of LCS
Xi and Yj equals to the length of LCS of

01/21/25
smaller strings Xi-1 and Yi-1 , plus 1 6
LCS recursive solution
c[i  1, j  1]  1 if x[i ]  y[ j ],
c[i, j ] 
 max(c[i, j  1], c[i  1, j ]) otherwise

• Second case: x[i] != y[j]


• As symbols don’t match, our solution is not
improved, and the length of LCS(Xi , Yj) is
the same as before (i.e. maximum of LCS(X i,
Yj-1) and LCS(Xi-1,Yj)

01/21/25
Why not just take the length of LCS(X i-1, Yj-1) ? 7
LCS Length Algorithm
LCS-Length(X, Y)
1. m = length(X) // get the # of symbols in X
2. n = length(Y) // get the # of symbols in Y
3. for i = 1 to m c[i,0] = 0 // special case: Y0
4. for j = 1 to n c[0,j] = 0 // special case: X0
5. for i = 1 to m // for all Xi
6. for j = 1 to n // for all Yj
7. if ( Xi == Yj )
8. c[i,j] = c[i-1,j-1] + 1
9. else c[i,j] = max( c[i-1,j], c[i,j-1] )
10. return c
01/21/25 8
LCS Example
We’ll see how LCS algorithm works on the
following example:
• X = ABCB
• Y = BDCAB

What is the Longest Common Subsequence


of X and Y?

LCS(X, Y) = BCB
X=AB C B
01/21/25 Y= BDCAB 9
ABCB
LCS Example (0) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi
A
1
2 B

3 C

4 B

X = ABCB; m = |X| = 4
Y = BDCAB; n = |Y| = 5
Allocate array c[5,4]
01/21/25 10
ABCB
LCS Example (1) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0
2 B
0
3 C 0
4 B 0

for i = 1 to m c[i,0] = 0
for j = 1 to n c[0,j] = 0
01/21/25 11
ABCB
LCS Example (2) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0
2 B
0
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 12
ABCB
LCS Example (3) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0
2 B
0
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 13
ABCB
LCS Example (4) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1
2 B
0
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 14
ABCB
LCS Example (5) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 15
ABCB
LCS Example (6) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 16
ABCB
LCS Example (7) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 17
ABCB
LCS Example (8) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 18
ABCB
LCS Example (10) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 19
ABCB
LCS Example (11) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 20
ABCB
LCS Example (12) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 21
ABCB
LCS Example (13) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 22
ABCB
LCS Example (14) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 23
ABCB
LCS Example (15) BDCAB
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2 3
if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
01/21/25 24
LCS Algorithm Running Time

• LCS algorithm calculates the values of each


entry of the array c[m,n]
• So what is the running time?

O(m*n)
since each c[i,j] is calculated in
constant time, and there are m*n
elements in the array
01/21/25 25
How to find actual LCS
• So far, we have just found the length of LCS,
but not LCS itself.
• We want to modify this algorithm to make it
output Longest Common Subsequence of X
and Y
Each c[i,j] depends on c[i-1,j] and c[i,j-1]
or c[i-1, j-1]
For each c[i,j] we can say how it was acquired:
2 2 For example, here
2 3 c[i,j] = c[i-1,j-1] +1 = 2+1=3
01/21/25 26
How to find actual LCS - continued
• Remember that
c[i  1, j  1]  1 if x[i ]  y[ j ],
c[i, j ] 
 max(c[i, j  1], c[i  1, j ]) otherwise

 So we can start from c[m,n] and go backwards


 Whenever c[i,j] = c[i-1, j-1]+1, remember
x[i] (because x[i] is a part of LCS)
 When i=0 or j=0 (i.e. we reached the
beginning), output remembered letters in
reverse order
01/21/25 27
Finding LCS
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2 3

01/21/25 28
Finding LCS (2)
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi 0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2 3
LCS (reversed order): B C B
LCS (straight order): B C B
01/21/25 (this string turned out to be a palindrome) 29

You might also like