Update suffix-array.md for Linear Time approach #1369

PeroxideParadox · 2024-10-16T11:57:07Z

The previous version of the suffix array article did not include a linear-time construction algorithm. I have added one, based on the Skew algorithm, to introduce an O(n) approach.
This should help competitive programmers with time-critical problems.

Reference: Skew Algorithm Paper

adamant-pwn · 2024-10-24T15:01:41Z

Thanks! I'm not very familiar with linear-time suffix array construction algorithms, but various places on the internet claim that SA-IS is simpler and faster than Kärkkäinen-Sanders. Is that true? I wonder if we should describe SA-IS instead of DC3... 🤔

PeroxideParadox · 2024-10-24T18:09:02Z

Thank you for the feedback! You're absolutely right. While the DC3 algorithm is efficient for large inputs and widely used when memory consumption is a constraint, I agree that SA-IS is simpler and faster in many practical scenarios, especially in competitive programming. Given this, I’m happy to switch the implementation to SA-IS, as it better aligns with the goals of this repository.

Would you like me to update the pull request to describe SA-IS instead of DC3?

adamant-pwn · 2024-10-24T18:56:03Z

Ideally we'd rely on some benchmarks: for example, implement both algorithms (as simply as possible) and see how they fare on Library Checker. Then, if the performance turns out to be a close match, we should describe the one that is simpler to understand; otherwise we should probably focus on the one with better performance.
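
A local comparison along these lines could start from a small timing harness like the sketch below. The names `naive_suffix_array` and `time_ms` are hypothetical and not part of the pull request; the naive builder is only a correctness baseline, and the DC3 and SA-IS implementations under test would be passed to `time_ms` in its place.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Baseline only (hypothetical helper): sorts suffix start positions by
// comparing the suffixes directly. Far slower than DC3 or SA-IS on large
// inputs, but handy for checking their output.
std::vector<int> naive_suffix_array(const std::string& s) {
    std::vector<int> sa(s.size());
    std::iota(sa.begin(), sa.end(), 0);
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    return sa;
}

// Runs build(s) once and returns the elapsed wall-clock time in milliseconds.
template <class Builder>
double time_ms(Builder build, const std::string& s) {
    auto start = std::chrono::steady_clock::now();
    // Use the result so the call is not optimized away.
    volatile std::size_t sink = build(s).size();
    (void)sink;
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Timing each candidate on the same inputs (random strings, a single repeated character, and so on) with `time_ms` gives a rough first impression before submitting to Library Checker.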

PeroxideParadox · 2024-10-26T09:44:08Z

On benchmarks, SA-IS performs better than DC3 and is easier to implement as well.

So let's switch to SA-IS; I will update the pull request.
Is that okay?

adamant-pwn · 2024-10-30T03:10:02Z

Yes, let's do it then.

PeroxideParadox · 2024-10-30T07:35:26Z

I have updated the pull request with the latest changes as discussed.

mhayter · 2025-04-11T23:22:25Z

It seems like this slipped through the cracks.
Is this ready to go?

adamant-pwn

Hi, thanks for the pull request, and sorry for the late review!

I only reviewed part of it, but I think we should add some explanation, in text, of what constitutes the algorithm (not just a TL;DR), and also a proper explanation of why it is linear, so I think it makes sense to return to the review once that is done.

Also, a lot of itemized/numbered lists misrender because of a missing blank line before the first list item, which should be fixed.

Also, so that I better understand this pull request, could you please tell me whether you used AI when making it (and if so, to what extent), and whether the code implementation is your own?

adamant-pwn · 2025-04-18T21:15:46Z

src/string/suffix-array.md

+
+### Understanding L-Type and S-Type Suffixes
+
+In the SA-IS algorithm, suffixes are classified as:


Please check your suggestion against the preview. Lists require a newline before the first item, otherwise they misrender.

adamant-pwn · 2025-04-18T21:16:43Z

src/string/suffix-array.md

+- **S-type (Right)**: A suffix is S-type if it is lexicographically smaller than or equal to the suffix immediately following it.
+
+For the string `banana$`, let's classify each suffix:
+1. Starting from the end, `$` is considered S-type by definition.


This also requires a newline before the first list item.

adamant-pwn · 2025-04-18T21:17:34Z

src/string/suffix-array.md

+| 3        | `ana$`  | S    |
+| 2        | `nana$` | L    |
+| 1        | `anana$`| S    |
+| 0        | `banana$`| S    |


I think `banana$` is actually larger than `anana$`, so the suffix at position 0 should be L-type?
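
The classification can be checked mechanically. The following is a minimal sketch, assuming the input ends with a unique sentinel that is smaller than every other character; `classify_suffixes` is a hypothetical helper name, not a function from the pull request.

```cpp
#include <string>

// Returns one character per position: 'S' if the suffix starting there is
// smaller than the suffix starting one position later, 'L' if it is larger.
// The last suffix (the sentinel) is S-type by definition.
std::string classify_suffixes(const std::string& s) {
    int n = (int)s.size();
    std::string type(n, 'S');
    for (int i = n - 2; i >= 0; i--) {
        if (s[i] != s[i + 1]) {
            // The first characters differ, so they decide the order.
            type[i] = (s[i] < s[i + 1]) ? 'S' : 'L';
        } else {
            // Equal first characters: the type is inherited from the right.
            type[i] = type[i + 1];
        }
    }
    return type;
}
```

For `banana$` this returns `LSLSLLS`, so position 0 comes out as L-type because `b` is greater than `a`.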

adamant-pwn · 2025-04-18T21:18:15Z

src/string/suffix-array.md

+The `L` and `S` types provide crucial information for sorting suffixes using induced sorting.
+### Example:
+For the string `s = "banana"`, The steps were :
+1. The algorithm first classifies suffixes into L-type and S-type.


This misses the newline.

adamant-pwn · 2025-04-18T21:19:08Z

src/string/suffix-array.md

+### Example:
+For the string `s = "banana"`, The steps were :
+1. The algorithm first classifies suffixes into L-type and S-type.
+2. It identifies LMS positions based on the L and S classifications.


Intro text says "the steps were", but the example above actually doesn't contain or explain any further steps.

adamant-pwn · 2025-04-18T21:22:15Z

src/string/suffix-array.md

+       if (n == 2) return (s[0] < s[1]) ? std::vector<int>{0, 1} : std::vector<int>{1, 0};
+   ```
+   This part initializes the `sa_is` function, which constructs the suffix array for a given input vector `s`. It first handles edge cases:
+   - If the string is empty (`n == 0`), it returns an empty array.


Newline needed before the first list item.

adamant-pwn · 2025-04-18T21:23:28Z

src/string/suffix-array.md

+
+### 1. **Suffix Array Initialization and Base Cases**
+   ```cpp
+   std::vector<int> sa_is(const std::vector<int>& s, int upper) {


What is `upper`? It appears to be the maximum possible character value, but that's not explicitly stated anywhere.

adamant-pwn · 2025-04-18T21:27:23Z

src/string/suffix-array.md

+           if (!ls[i]) sum_s[s[i]]++;
+           else sum_l[s[i] + 1]++;


Shouldn't it be the other way around? `ls[i] = true` $\iff$ the suffix at $i$ is smaller than the suffix at $i+1$, which means it is an $S$-type suffix?
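
To keep the convention separate from the variable names, here is a minimal sketch of the bucket bookkeeping with explicit names. It assumes that `s` holds values in `[0, upper]`, that its last element is a unique smallest sentinel, and that `is_s[i]` is true exactly for S-type suffixes; `bucket_starts` is a hypothetical helper and does not reproduce the pull request's `sum_s` / `sum_l` arrays. It relies on the fact that, among suffixes sharing a first character, all L-type suffixes sort before all S-type ones.

```cpp
#include <utility>
#include <vector>

// Returns {l_start, s_start}: for each character c, the first suffix-array
// slot used by L-type suffixes starting with c, and the first slot used by
// S-type suffixes starting with c.
std::pair<std::vector<int>, std::vector<int>>
bucket_starts(const std::vector<int>& s, int upper) {
    int n = (int)s.size();
    // is_s[i] == true  <=>  suffix i is smaller than suffix i + 1 (S-type).
    std::vector<bool> is_s(n, true);
    for (int i = n - 2; i >= 0; i--) {
        is_s[i] = (s[i] == s[i + 1]) ? is_s[i + 1] : (s[i] < s[i + 1]);
    }
    // Count L-type and S-type suffixes per first character.
    std::vector<int> cnt_l(upper + 1, 0), cnt_s(upper + 1, 0);
    for (int i = 0; i < n; i++) {
        if (is_s[i]) cnt_s[s[i]]++;
        else cnt_l[s[i]]++;
    }
    // Lay the buckets out in character order, L-type part first.
    std::vector<int> l_start(upper + 1), s_start(upper + 1);
    int pos = 0;
    for (int c = 0; c <= upper; c++) {
        l_start[c] = pos;
        s_start[c] = pos + cnt_l[c];
        pos += cnt_l[c] + cnt_s[c];
    }
    return {l_start, s_start};
}
```

With `banana$` encoded as `{2, 1, 3, 1, 3, 1, 0}` (`$` = 0, `a` = 1, `b` = 2, `n` = 3) and `upper = 3`, this gives `l_start = {0, 1, 4, 5}` and `s_start = {0, 2, 5, 7}`.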

adamant-pwn · 2025-04-18T21:28:13Z

src/string/suffix-array.md

+2. **Identifying LMS Substrings**:
+   - SA-IS identifies LMS (Leftmost S-type) positions, which are boundaries between L-type and S-type suffixes. LMS substrings are critical as they serve as anchor points for the sorting process.
+
+3. **Induced Sorting**:


I think we really need to add further explanation, in text, of how exactly induced sorting works, and why recursive calls would be linear.
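
As a starting point for such an explanation, the induction step can be sketched in isolation. The code below is a minimal sketch, not the pull request's implementation: it assumes the input ends with a unique smallest sentinel such as `$`, and it sorts the LMS suffixes by direct comparison as a stand-in for the recursive call (so this sketch is not linear). `induce_from_sorted_lms` is a hypothetical name. On the linearity question: two adjacent positions cannot both be LMS, so there are at most n/2 LMS positions, and the recursive problem is at most half the size of the current one.

```cpp
#include <algorithm>
#include <string>
#include <vector>

std::vector<int> induce_from_sorted_lms(const std::string& s) {
    int n = (int)s.size();
    if (n == 0) return {};
    if (n == 1) return {0};
    std::vector<int> t(n);  // characters as non-negative integers
    for (int i = 0; i < n; i++) t[i] = (unsigned char)s[i];
    std::vector<bool> is_s(n, true);  // true for S-type suffixes
    for (int i = n - 2; i >= 0; i--)
        is_s[i] = (t[i] == t[i + 1]) ? is_s[i + 1] : (t[i] < t[i + 1]);
    // LMS positions: S-type with an L-type left neighbour.
    std::vector<int> lms;
    for (int i = 1; i < n; i++)
        if (is_s[i] && !is_s[i - 1]) lms.push_back(i);
    // Stand-in for the recursive step: sort LMS suffixes directly.
    std::sort(lms.begin(), lms.end(), [&](int a, int b) {
        return std::lexicographical_compare(t.begin() + a, t.end(),
                                            t.begin() + b, t.end());
    });
    // start[c] is the first slot of bucket c, start[c + 1] is one past its end.
    std::vector<int> start(257, 0);
    for (int c : t) start[c + 1]++;
    for (int c = 0; c < 256; c++) start[c + 1] += start[c];
    std::vector<int> sa(n, -1);
    // Step 1: put the sorted LMS suffixes at the ends of their buckets.
    std::vector<int> tail(start.begin() + 1, start.end());
    for (int k = (int)lms.size() - 1; k >= 0; k--) sa[--tail[t[lms[k]]]] = lms[k];
    // Step 2: scan left to right; each placed suffix j + 1 induces an
    // L-type suffix j at the front of j's bucket.
    std::vector<int> head(start.begin(), start.end() - 1);
    for (int i = 0; i < n; i++) {
        if (sa[i] <= 0) continue;
        int j = sa[i] - 1;
        if (!is_s[j]) sa[head[t[j]]++] = j;
    }
    // Step 3: scan right to left; each placed suffix j + 1 induces an
    // S-type suffix j at the back of j's bucket, overwriting step 1.
    tail.assign(start.begin() + 1, start.end());
    for (int i = n - 1; i >= 0; i--) {
        if (sa[i] <= 0) continue;
        int j = sa[i] - 1;
        if (is_s[j]) sa[--tail[t[j]]] = j;
    }
    return sa;
}
```

For `banana$` the LMS positions are 1, 3 and 6, and the function returns `{6, 5, 3, 1, 0, 4, 2}`. Each of the three passes touches every slot once, so the induction itself is linear given the sorted LMS suffixes.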

adamant-pwn · 2025-04-18T21:28:35Z

src/string/suffix-array.md

+3. **Induced Sorting**:
+   - The algorithm first sorts LMS substrings recursively, then induces the order of the remaining suffixes. Sorting the LMS substrings is the key part of the algorithm, and once sorted, the remaining suffixes are arranged by their lexicographical order relative to LMS substrings.
+
+4. **Recursive Call for LMS Substrings**:


Isn't this already included in the previous step?..

mhayter · 2025-06-28T19:01:52Z

@PeroxideParadox we'd love to publish your article after these conflicts are resolved!

Update suffix-array.md

2bd61ca

github-actions bot added a commit that referenced this pull request Oct 23, 2024

Preview for #1369 (2bd61ca) at https://gh.cp-algorithms.com/1369/

a8c9be5

Update suffix-array.md for Linear Time approach (SA-IS Algo)

67e5aaf

mhayter closed this Apr 11, 2025

github-actions bot added a commit that referenced this pull request Apr 11, 2025

Delete preview for #1369

e991568

mhayter reopened this Apr 11, 2025

github-actions bot added a commit that referenced this pull request Apr 11, 2025

Preview for #1369 (67e5aaf) at https://gh.cp-algorithms.com/1369/

c6f6cc3

mhayter requested a review from adamant-pwn April 11, 2025 23:23

Merge branch 'main' into patch-1

3a4fe0d

mhayter added the supposedly resolved label Apr 17, 2025

github-actions bot added a commit that referenced this pull request Apr 17, 2025

Preview for #1369 (3a4fe0d) at https://gh.cp-algorithms.com/1369/

2535885

adamant-pwn requested changes Apr 18, 2025

adamant-pwn removed the supposedly resolved label Apr 18, 2025

mhayter added the enhancement label Jun 28, 2025

