Skip to content

Commit acc62aa

Browse files
Mrityunjai SinghMrityunjai Singh
Mrityunjai Singh
authored and
Mrityunjai Singh
committed
aho corasick text change
1 parent 4685762 commit acc62aa

File tree

1 file changed

+3
-1
lines changed

1 file changed

+3
-1
lines changed

src/string/aho_corasick.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -36,6 +36,7 @@ Accordingly, a trie for a set of strings is a trie such that each $\text{output}
3636
We now describe how to construct a trie for a given set of strings in linear time with respect to their total length.
3737

3838
We introduce a structure for the vertices of the tree:
39+
3940
```{.cpp file=aho_corasick_trie_definition}
4041
const int K = 26;
4142

@@ -209,7 +210,7 @@ Assume that at the moment we stand in a vertex $v$ and consider a character $c$.
209210
1. $go[v][c] = -1$. In this case, we may assign $go[v][c] = go[u][c]$, which is already known by the induction hypothesis;
210211
2. $go[v][c] = w \neq -1$. In this case, we may assign $link[w] = go[u][c]$.
211212
212-
In this way, we spend $O(1)$ time per each pair of a vertex and a character, making the running time $O(mk)$. The major overhead here is that we copy a lot of transitions from $u$ in the first case, while the transitions of the second case form the trie and sum up to $m$ over all vertices. To avoid the copying of $go[u][c]$, we may use a persistent array data structure, using which we initially copy $go[u]$ into $go[v]$ and then only update values for characters in which the transition would differ. This leads to the $O(m \log k)$ algorithm.
213+
In this way, we spend $O(1)$ time per each pair of a vertex and a character, making the running time $O(mk)$. The major overhead here is that we copy a lot of transitions from $u$ in the first case, while the transitions of the second case form the trie and sum up to $m$ over all vertices. To avoid the copying of $go[u][c]$, we may use a persistent array data structure, using which we initially copy $go[u]$ into $go[v]$ and then only update values for characters in which the transition would differ. This leads to the $O(n \log k)$ algorithm.
213214
214215
## Applications
215216
@@ -277,4 +278,5 @@ Thus we can find such a path using depth first search (and if the search looks
277278
- [CodeChef - TWOSTRS](https://www.codechef.com/MAY20A/problems/TWOSTRS)
278279
279280
## References
281+
280282
- [Stanford's CS166 - Aho-Corasick Automata](http://web.stanford.edu/class/archive/cs/cs166/cs166.1166/lectures/02/Slides02.pdf) ([Condensed](http://web.stanford.edu/class/archive/cs/cs166/cs166.1166/lectures/02/Small02.pdf))

0 commit comments

Comments
 (0)