Two-Way String Algorithm
Two-Way String Algorithm
Unlike BM and KMP, it uses only O(log m) additional space to store information
about those partial repeats: the search pattern is split into two parts (its
critical factorization), represented only by the position of that split. Being a
number less than m, it can be represented in ⌈log₂ m⌉ bits. This is sometimes
treated as "close enough to O(1) in practice", as the needle's size is limited by
the size of addressable memory; the overhead is a number that can be stored in a
single register, and treating it as O(1) is like treating the size of a loop
counter as O(1) rather than log of the number of iterations. The actual matching
operation performs at most 2n − m comparisons.[2]
The first one performs at most n + ⌊(n − m)/2⌋ comparisons, ⌈(n − m)/2⌉ fewer
than the original. It must however store ⌈log φ {\displaystyle \varphi } m⌉
additional offsets in the needle, using O(log2 m) space.
The second adapts it to only store a constant number of such offsets, denoted
c, but must perform n + ⌊(1⁄2 + ε) * (n − m)⌋ comparisons, with ε = 1⁄2(Fc+2 − 1)−1
= O( φ {\displaystyle \varphi }−c) going to zero exponentially quickly as c
increases.