Improve Sieve of Eratosthenes article: expanded explanations, added code and comments for odd-only and optimized sieves, and corrected mathematical typos. #1483

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Open

MythOfSisyphus wants to merge 2 commits into cp-algorithms:main from MythOfSisyphus:sieve-refactor

MythOfSisyphus commented Jul 23, 2025

Expanded key explanations (e.g. why sieving starts from p*p and how ceil(L/p)*p avoids marking the prime itself)
Added well-commented code for:
- Odd-only sieve
- Segmented sieve
- Prime count via blocks
- Sieve over arbitrary range [L, R]
Corrected mathematical typos in writing intervals.


          Improve Sieve of Eratosthenes article: expanded explanations, added c…

5b5b06c

…ode and comments for odd-only and optimized sieves, and corrected mathematical typos.

github-actions bot added a commit that referenced this pull request


          Preview for #1483 (5b5b06c) at https://gh.cp-algorithms.com/1483/

db0be48

Contributor

mhayter commented Aug 1, 2025

I'm confused on some of these changes:

Why is x changed to n as a variable? We already defined n.

Why are we using limit now and not defining it?

Author

MythOfSisyphus commented Aug 2, 2025 •

edited

Loading

"Why is $x$ changed to $n$ as a variable? We already defined $n$." Oh, yeah! I didn't realize it. I took $n$ because it is more natural choice to denote an integer. But yeah, as we've written in first line "... finding prime numbers in range $[1, n]$". We could either stick with the original $x$ or use a different variable name like $m$ for clarity.

"Why are we using limit now and not defining it?" I'm really sorry for that. Yeah, in section Sieving till root I wrote j <= limit, in second for loop, without defining limit. Actually, it should be j <= n. I didn't notice it because I used limit to denote square root of $n$ and $R$ in Segmented Sieve.

Thank you very much for closely reviewing my all changes.
I’ll be happy to fix both issues and push an updated commit to this branch — unless you'd prefer to handle the minor edits directly before merging.

Contributor

mhayter commented Aug 2, 2025 •

edited

Loading

I'm not convinced we will be merging. Please fix these issues and proofread your request. I haven't thoroughly reviewed by your request yet because of two obvious issues.

Author

MythOfSisyphus commented Aug 3, 2025

I don't know what you're finding hard to understand. If you read original article here after that read my refactored version here you'll clearly see how much explanation and readability in the code I improved.

Basically, I edited the whole article. If you have read cp-algorithm closely you'll realize that "code is very poor and explanations are not great." That's what I tried to improve in this very file.

If after reading both articles you still find yourself confused, then close my pull request (OfCourse without merging). I'll be happy to keep my changes locally in my machine only.

As my title and description are very what I did with the file.

mhayter closed this

github-actions bot added a commit that referenced this pull request


          Delete preview for #1483

b27e633

adamant-pwn reopened this

adamant-pwn requested changes

View reviewed changes

Member

adamant-pwn left a comment

Hi, thanks for the pull request! I left some comments where it seemed appropriate, and I think we would need to address them before considering merging.

I'm also conflicted about significantly rewriting code blocks, as we currently don't have tests for them. We should either add some tests (see here), or at least make sure that they were tested in some actual competitive programming problems.

src/algebra/sieve-of-eratosthenes.md

    
              The algorithm is very simple:

              at the beginning we write down all numbers between 2 and $n$.

              We mark all proper multiples of 2 (since 2 is the smallest prime number) as composite.

              A proper multiple of a number $x$, is a number greater than $x$ and divisible by $x$.

              A proper multiple of a number $n$, is a number greater than $n$ and divisible by $n$.

Member

adamant-pwn Aug 9, 2025

Suggested change

      
            A proper multiple of a number $n$, is a number greater than $n$ and divisible by $n$.
          
            A proper multiple of a number $m$, is a number greater than $n$ and divisible by $m$.

$n$ is already used.

src/algebra/sieve-of-eratosthenes.md

Comment on lines +101 to +102

		To understand it more clearly please checkout RobJohn's [this answer](https://math.stackexchange.com/a/58808) to the question "Why in Sieve of Eratosthenes of $N$
		number you need to check and cross out numbers up to $\sqrt{N}$?" on [MathStackExchange](https://math.stackexchange.com).

Member

adamant-pwn Aug 9, 2025

Suggested change

      
            To understand it more clearly please checkout RobJohn's [this answer](https://math.stackexchange.com/a/58808) to the question "Why in Sieve of Eratosthenes of $N$
          
            number you need to check and cross out numbers up to $\sqrt{N}$?" on [MathStackExchange](https://math.stackexchange.com).
          
            This is because the smallest prime factor $p$ of a composite number $m \leq n$ may not exceed $\sqrt m \leq \sqrt n$. Indeed, $p \cdot \frac{m}{p} = m$ and $p \leq \frac{m}{p}$, thus if $p$ was greater than $\sqrt m$, the product of $p$ and $\frac{m}{p}$ would be greater than $(\sqrt m)^2 = m$.

I think it's better to provide an inline explanation, rather than link to stackexchange, esp. given that the explanation is fairly short.

src/algebra/sieve-of-eratosthenes.md

-                      for (int j = i * i; j <= n; j += i)
-                          is_prime[j] = false;
+                  if(!is_prime[i]) continue;
+                  for (int j = i * i; j <= limit; j += i){

Member

adamant-pwn Aug 9, 2025

Suggested change

      
                for (int j = i * i; j <= limit; j += i){
          
                for (int j = i * i; j <= n; j += i){

limit is used here, but not defined anywhere. Generally, I don't think it should be different from $n$ in this particular part.

src/algebra/sieve-of-eratosthenes.md

               First, it will allow us to halve the needed memory. Second, it will reduce the number of operations performed by algorithm approximately in half.
+              ```cpp
+              // returns a vector of all prime numbers in the interval [2, n]
+              vector<int> primes_upto(int n){

Member

adamant-pwn Aug 9, 2025

Was this code tested on any particular problem? If no, could you please do it?

src/algebra/sieve-of-eratosthenes.md

               A benchmark ([link](https://gist.github.com/jakobkogler/e6359ea9ced24fe304f1a8af3c9bee0e)) shows, that using a `vector<bool>` is between 1.4x and 1.7x faster than using a `vector<char>`.
               The same considerations also apply to `bitset`.
-              It's also an efficient way of storing bits, similar to `vector<bool>`, so it takes only $\frac{N}{8}$ bytes of memory, but is a bit slower in accessing the elements.
+              It's also an efficient way of storing bits, similar to `vector<bool>`, so it takes only $\frac{N}{8}$ bytes of memory, but is a bit slower in accessing the elements means `vector<bool>` is a space-optimized specialization that typically uses 1 bit per element, unlike standard containers like `vector<char>` or `vector<bool>` (if it were a normal template), which use 1 byte per element.

Member

adamant-pwn Aug 9, 2025

Suggested change

      
            It's also an efficient way of storing bits, similar to `vector<bool>`, so it takes only $\frac{N}{8}$ bytes of memory, but is a bit slower in accessing the elements means `vector<bool>` is a space-optimized specialization that typically uses 1 bit per element, unlike standard containers like `vector<char>` or `vector<bool>` (if it were a normal template), which use 1 byte per element.
          
            It's also an efficient way of storing bits, similar to `vector<bool>`, so it takes only $\frac{N}{8}$ bytes of memory, but is a bit slower in accessing the elements.

This paragraph is about bitset, not vector<bool>. The explanation in this particular paragraph seems out of place, moreover I believe the same information about vector<bool> is already conveyed above, on lines 159-160.

src/algebra/sieve-of-eratosthenes.md

               }
               ```
+              #### Note on 'start'

Member

adamant-pwn Aug 9, 2025

I'm not sure it warrants a dedicated section. Also when your additions ends, it is followed by some general stuff that is not from this particular section, which might be confusing as to where we stop talking about start and start discussing the block sieve in general.

src/algebra/sieve-of-eratosthenes.md

-                          primes.push_back(i);
-                          for (int j = i * i; j <= nsqrt; j += i)
-                              is_prime[j] = false;
+                  vector<bool> is_prime(limit + 1, true);

Member

adamant-pwn Aug 9, 2025

As we are now no longer limited by the cache speeds, we can replace the vector<bool> with a vector<char>

The usage of vector<char> rather than vector<bool> was intentional.

src/algebra/sieve-of-eratosthenes.md

               We pre-generate all prime numbers up to $\sqrt R$, and use those primes to mark all composite numbers in the segment $[L, R]$.
               ```cpp
-              vector<char> segmentedSieve(long long L, long long R) {
+              vector<bool> segmentedSieve(long long L, long long R) {

Member

adamant-pwn Aug 9, 2025

As we are now no longer limited by the cache speeds, we can replace the vector<bool> with a vector<char>

The usage of vector<char> rather than vector<bool> was intentional.

src/algebra/sieve-of-eratosthenes.md

               We pre-generate all prime numbers up to $\sqrt R$, and use those primes to mark all composite numbers in the segment $[L, R]$.
               ```cpp
-              vector<char> segmentedSieve(long long L, long long R) {
+              vector<bool> segmentedSieve(long long L, long long R) {

Member

adamant-pwn Aug 9, 2025

Did you test the new version on any particular problem? If no, could you please do it?

src/algebra/sieve-of-eratosthenes.md

-                  long long lim = sqrt(R);
-                  for (long long i = 2; i <= lim; ++i)
-                      for (long long j = max(i * i, (L + i - 1) / i * i); j <= R; j += i)
+              vector<bool> segmentedSieveNoPreGen(long long L, long long R) {

Member

adamant-pwn Aug 9, 2025

Did you test the new version on any particular problem? If no, could you please do it?


          Merge branch 'main' into sieve-refactor

35640bb

github-actions bot added a commit that referenced this pull request


          Preview for #1483 (5b5b06c) at https://gh.cp-algorithms.com/1483/

a0743b4

github-actions bot added a commit that referenced this pull request


          Preview for #1483 (35640bb) at https://gh.cp-algorithms.com/1483/

1fa208f

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet