Copyright © 2014 - 2024 by cp-algorithms contributors +edit_uri: edit/main/src/ +copyright: Text is available under the Creative Commons Attribution Share Alike 4.0 International License
Copyright © 2014 - 2025 by cp-algorithms contributors extra_javascript: - javascript/config.js - - https://polyfill.io/v3/polyfill.min.js?features=es6 + - https://cdnjs.cloudflare.com/polyfill/v3/polyfill.min.js?features=es6 - https://unpkg.com/mathjax@3/es5/tex-mml-chtml.js extra_css: - stylesheets/extra.css @@ -57,12 +56,13 @@ markdown_extensions: permalink: true plugins: + - toggle-sidebar: + toggle_button: all - mkdocs-simple-hooks: hooks: on_env: "hooks:on_env" - search - - tags: - tags_file: tags.md + - tags - literate-nav: nav_file: navigation.md - git-revision-date-localized: @@ -74,6 +74,7 @@ plugins: docs_path: src/ token: !ENV MKDOCS_GIT_COMMITTERS_APIKEY enabled: !ENV [MKDOCS_ENABLE_GIT_COMMITTERS, False] + branch: !ENV [MKDOCS_GIT_COMMITERS_BRANCH, main] - macros - rss diff --git a/scripts/install-mkdocs.sh b/scripts/install-mkdocs.sh index 4202c9a97..9c4e73c3f 100755 --- a/scripts/install-mkdocs.sh +++ b/scripts/install-mkdocs.sh @@ -2,6 +2,7 @@ pip install \ "mkdocs-material>=9.0.2" \ + mkdocs-toggle-sidebar-plugin \ mkdocs-macros-plugin \ mkdocs-literate-nav \ mkdocs-git-authors-plugin \ diff --git a/src/algebra/bit-manipulation.md b/src/algebra/bit-manipulation.md index 54c42f4d6..f29810323 100644 --- a/src/algebra/bit-manipulation.md +++ b/src/algebra/bit-manipulation.md @@ -186,6 +186,44 @@ int countSetBits(int n) } ``` +### Count set bits up to $n$ +To count the number of set bits of all numbers up to $n$ (inclusive), we could run Brian Kernighan's algorithm on every number up to $n$, but iterating over all numbers is too slow for large $n$ and will result in a "Time Limit Exceeded" in contest submissions. + +We can use the fact that for the numbers up to $2^x$ (i.e. from $1$ to $2^x - 1$) there are $x \cdot 2^{x-1}$ set bits in total. This can be visualised as follows. +``` +0 -> 0 0 0 0 +1 -> 0 0 0 1 +2 -> 0 0 1 0 +3 -> 0 0 1 1 +4 -> 0 1 0 0 +5 -> 0 1 0 1 +6 -> 0 1 1 0 +7 -> 0 1 1 1 +8 -> 1 0 0 0 +``` + +We can see that all the columns except the leftmost have $4$ (i.e. $2^2$) set bits each, i.e. up to the number $2^3 - 1$ there are $3 \cdot 2^{3-1}$ set bits in total. + +With this knowledge in hand we can come up with the following algorithm: + +- Find the highest power of $2$ that is less than or equal to the given number. Let this power be $2^x$, i.e. $x$ is the position of the most significant set bit of $n$. +- Calculate the number of set bits from $1$ to $2^x - 1$ using the formula $x \cdot 2^{x-1}$. +- Count the set bits contributed by the most significant bit of the numbers from $2^x$ to $n$, which is $n - 2^x + 1$, and add it. +- Subtract $2^x$ from $n$ and repeat the above steps with the new $n$. + 
+```cpp
+// std::bit_width requires C++20 and the <bit> header.
+// A 64-bit counter is used, since the total number of set bits overflows int for large n.
+long long countSetBits(int n) {
+    long long count = 0;
+    while (n > 0) {
+        // x is the position of the most significant set bit of n
+        int x = std::bit_width((unsigned)n) - 1;
+        // the numbers 1 ... 2^x - 1 contribute x * 2^(x-1) set bits
+        if (x > 0)
+            count += (long long)x << (x - 1);
+        // the numbers 2^x ... n all have that most significant bit set,
+        // contributing n - 2^x + 1 more; the loop then continues with n - 2^x
+        n -= 1 << x;
+        count += n + 1;
+    }
+    return count;
+}
+```
+
 ### Additional tricks - $n ~\&~ (n + 1)$ clears all trailing ones: $0011~0111_2 \rightarrow 0011~0000_2$. diff --git a/src/algebra/euclid-algorithm.md b/src/algebra/euclid-algorithm.md index 18638a74c..2de931871 100644 --- a/src/algebra/euclid-algorithm.md +++ b/src/algebra/euclid-algorithm.md @@ -15,7 +15,7 @@ $$\gcd(a, b) = \max \{k > 0 : (k \mid a) \text{ and } (k \mid b) \}$$ When one of the numbers is zero, while the other is non-zero, their greatest common divisor, by definition, is the second number. When both numbers are zero, their greatest common divisor is undefined (it can be any arbitrarily large number), but it is convenient to define it as zero as well to preserve the associativity of $\gcd$. Which gives us a simple rule: if one of the numbers is zero, the greatest common divisor is the other number. 
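As a minimal illustration of this rule (a hedged sketch, not the article's implementation; the recursive step is exactly the Euclidean algorithm discussed below):

```cpp
// If one of the arguments is zero, the gcd is the other one:
// gcd(a, 0) = a, gcd(0, b) = b, and gcd(0, 0) = 0 by the convention above.
long long gcd(long long a, long long b) {
    return b == 0 ? a : gcd(b, a % b);
}
```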
-The Euclidean algorithm, discussed below, allows to find the greatest common divisor of two numbers $a$ and $b$ in $O(\log \min(a, b))$. +The Euclidean algorithm, discussed below, allows us to find the greatest common divisor of two numbers $a$ and $b$ in $O(\log \min(a, b))$. Since the function is **associative**, to find the GCD of **more than two numbers**, we can do $\gcd(a, b, c) = \gcd(a, \gcd(b, c))$ and so forth. The algorithm was first described in Euclid's "Elements" (circa 300 BC), but it is possible that the algorithm has even earlier origins. @@ -70,7 +70,7 @@ Moreover, it is possible to show that the upper bound of this theorem is optimal Given that Fibonacci numbers grow exponentially, we get that the Euclidean algorithm works in $O(\log \min(a, b))$. -Another way to estimate the complexity is to notice that $a \bmod b$ for the case $a \geq b$ is at least $2$ times smaller than $a$, so the larger number is reduced at least in half on each iteration of the algorithm. +Another way to estimate the complexity is to notice that $a \bmod b$ for the case $a \geq b$ is at least $2$ times smaller than $a$, so the larger number is reduced at least in half on each iteration of the algorithm. Applying this reasoning to the case when we compute the GCD of the set of numbers $a_1,\dots,a_n \leq C$, this also allows us to estimate the total runtime as $O(n + \log C)$, rather than $O(n \log C)$, since every non-trivial iteration of the algorithm reduces the current GCD candidate by at least a factor of $2$. ## Least common multiple @@ -126,3 +126,4 @@ E.g. C++17 has such a function `std::gcd` in the `numeric` header. ## Practice Problems - [CSAcademy - Greatest Common Divisor](https://csacademy.com/contest/archive/task/gcd/) +- [Codeforces 1916B - Two Divisors](https://codeforces.com/contest/1916/problem/B) diff --git a/src/algebra/extended-euclid-algorithm.md b/src/algebra/extended-euclid-algorithm.md index 4e96f3729..3e845e728 100644 --- a/src/algebra/extended-euclid-algorithm.md +++ b/src/algebra/extended-euclid-algorithm.md @@ -95,9 +95,32 @@ int gcd(int a, int b, int& x, int& y) { If you look closely at the variables `a1` and `b1`, you can notice that they take exactly the same values as in the iterative version of the normal [Euclidean algorithm](euclid-algorithm.md). So the algorithm will at least compute the correct GCD. -To see why the algorithm also computes the correct coefficients, you can check that the following invariants will hold at any time (before the while loop, and at the end of each iteration): $x \cdot a + y \cdot b = a_1$ and $x_1 \cdot a + y_1 \cdot b = b_1$. -It's trivial to see, that these two equations are satisfied at the beginning. -And you can check that the update in the loop iteration will still keep those equalities valid. +To see why the algorithm computes the correct coefficients, consider that the following invariants hold at any given time (before the while loop begins and at the end of each iteration): + +$$x \cdot a + y \cdot b = a_1$$ + +$$x_1 \cdot a + y_1 \cdot b = b_1$$ + +Let the values at the end of an iteration be denoted by a prime ($'$), and let $q = \left\lfloor \frac{a_1}{b_1} \right\rfloor$ be the integer quotient computed in the loop. 
From the [Euclidean algorithm](euclid-algorithm.md), we have: + +$$a_1' = b_1$$ + +$$b_1' = a_1 - q \cdot b_1$$ + +For the first invariant to hold, the following should be true: + +$$x' \cdot a + y' \cdot b = a_1' = b_1$$ + +and, substituting $b_1$ from the second invariant, + +$$x' \cdot a + y' \cdot b = x_1 \cdot a + y_1 \cdot b$$ + +Similarly for the second invariant, the following should hold: + +$$x_1' \cdot a + y_1' \cdot b = a_1 - q \cdot b_1$$ + +and, substituting $a_1$ and $b_1$ from both invariants, + +$$x_1' \cdot a + y_1' \cdot b = (x - q \cdot x_1) \cdot a + (y - q \cdot y_1) \cdot b$$ + +By comparing the coefficients of $a$ and $b$, the update equations for each variable can be derived, ensuring that the invariants are maintained throughout the algorithm. + At the end we know that $a_1$ contains the GCD, so $x \cdot a + y \cdot b = g$. Which means that we have found the required coefficients. diff --git a/src/algebra/factorial-modulo.md b/src/algebra/factorial-modulo.md index 73ff0145b..86961ba33 100644 --- a/src/algebra/factorial-modulo.md +++ b/src/algebra/factorial-modulo.md @@ -14,7 +14,7 @@ Otherwise $p!$ and subsequent terms will reduce to zero. But in fractions the factors of $p$ can cancel, and the resulting expression will be non-zero modulo $p$. Thus, formally the task is: You want to calculate $n! \bmod p$, without taking all the multiple factors of $p$ into account that appear in the factorial. -Imaging you write down the prime factorization of $n!$, remove all factors $p$, and compute the product modulo $p$. +Imagine you write down the prime factorization of $n!$, remove all factors $p$, and compute the product modulo $p$. We will denote this *modified* factorial with $n!_{\%p}$. For instance $7!_{\%p} \equiv 1 \cdot 2 \cdot \underbrace{1}_{3} \cdot 4 \cdot 5 \cdot \underbrace{2}_{6} \cdot 7 \equiv 2 \bmod 3$. diff --git a/src/algebra/factorization.md b/src/algebra/factorization.md index 58a43b961..14715605f 100644 --- a/src/algebra/factorization.md +++ b/src/algebra/factorization.md @@ -159,11 +159,10 @@ By looking at the squares $a^2$ modulo a fixed small number, it can be observed ## Pollard's $p - 1$ method { data-toc-label="Pollard's method" } -It is very likely that at least one factor of a number is $B$**-powersmooth** for small $B$. -$B$-powersmooth means that every prime power $d^k$ that divides $p-1$ is at most $B$. +It is very likely that a number $n$ has at least one prime factor $p$ such that $p - 1$ is $\mathrm{B}$**-powersmooth** for small $\mathrm{B}$. An integer $m$ is said to be $\mathrm{B}$-powersmooth if every prime power dividing $m$ is at most $\mathrm{B}$. Formally, let $\mathrm{B} \geqslant 1$ and let $m$ be any positive integer. Suppose the prime factorization of $m$ is $m = \prod {q_i}^{e_i}$, where each $q_i$ is a prime and $e_i \geqslant 1$. Then $m$ is $\mathrm{B}$-powersmooth if, for all $i$, ${q_i}^{e_i} \leqslant \mathrm{B}$. E.g. the prime factorization of $4817191$ is $1303 \cdot 3697$. -And the factors are $31$-powersmooth and $16$-powersmooth respectably, because $1303 - 1 = 2 \cdot 3 \cdot 7 \cdot 31$ and $3697 - 1 = 2^4 \cdot 3 \cdot 7 \cdot 11$. -In 1974 John Pollard invented a method to extracts $B$-powersmooth factors from a composite number. +And the values, $1303 - 1$ and $3697 - 1$, are $31$-powersmooth and $16$-powersmooth respectively, because $1303 - 1 = 2 \cdot 3 \cdot 7 \cdot 31$ and $3697 - 1 = 2^4 \cdot 3 \cdot 7 \cdot 11$. +In 1974 John Pollard invented a method to extract factors $p$ such that $p-1$ is $\mathrm{B}$-powersmooth from a composite number. The idea comes from [Fermat's little theorem](phi-function.md#application). 
Let a factorization of $n$ be $n = p \cdot q$. @@ -180,7 +179,7 @@ This means that $a^M - 1 = p \cdot r$, and because of that also $p ~|~ \gcd(a^M - 1, n)$. Therefore, if $p - 1$ for a factor $p$ of $n$ divides $M$, we can extract a factor using [Euclid's algorithm](euclid-algorithm.md). -It is clear, that the smallest $M$ that is a multiple of every $B$-powersmooth number is $\text{lcm}(1,~2~,3~,4~,~\dots,~B)$. +It is clear that the smallest $M$ that is a multiple of every $\mathrm{B}$-powersmooth number is $\text{lcm}(1,~2,~3,~4,~\dots,~B)$. Or alternatively: $$M = \prod_{\text{prime } q \le B} q^{\lfloor \log_q B \rfloor}$$ Notice, if $p-1$ divides $M$ for all prime factors $p$ of $n$, then $\gcd(a^M - 1, n) = n$. In this case we don't receive a factor. Therefore, we will try to perform the $\gcd$ multiple times, while we compute $M$. -Some composite numbers don't have $B$-powersmooth factors for small $B$. -For example, the factors of the composite number $100~000~000~000~000~493 = 763~013 \cdot 131~059~365~961$ are $190~753$-powersmooth and $1~092~161~383$-powersmooth. -We will have to choose $B >= 190~753$ to factorize the number. +Some composite numbers don't have factors $p$ such that $p-1$ is $\mathrm{B}$-powersmooth for small $\mathrm{B}$. +For example, for the composite number $100~000~000~000~000~493 = 763~013 \cdot 131~059~365~961$, the values $p-1$ are $190~753$-powersmooth and $1~092~161~383$-powersmooth respectively. +We will have to choose $\mathrm{B} \geq 190~753$ to factorize the number. -In the following implementation we start with $B = 10$ and increase $B$ after each each iteration. +In the following implementation we start with $\mathrm{B} = 10$ and increase $\mathrm{B}$ after each iteration. ```{.cpp file=factorization_p_minus_1} long long pollards_p_minus_1(long long n) { @@ -246,7 +245,9 @@ If $p$ is smaller than $\sqrt{n}$, the repetition will likely start in $O(\sqrt[4]{n})$. Here is a visualization of such a sequence $\{x_i \bmod p\}$ with $n = 2206637$, $p = 317$, $x_0 = 2$ and $f(x) = x^2 + 1$. From the form of the sequence you can see very clearly why the algorithm is called Pollard's $\rho$ algorithm. -
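The article's own implementations are not part of this diff; as a quick illustration of the idea just described, here is a minimal, hedged sketch of one run of Pollard's rho using Floyd's cycle finding. The function name, the parameters `x0` and `c`, and the use of `__int128` are illustrative assumptions, not taken from the article:

```cpp
#include <numeric>  // std::gcd

// One run of Pollard's rho with f(x) = x^2 + c (mod n) and Floyd's cycle finding.
// Assumes n is composite and fits in 63 bits; __int128 (a GCC/Clang extension) is
// used for the modular multiplication. Returns a non-trivial divisor of n, or n
// itself if this run fails (in which case one would retry with another x0 or c).
long long pollards_rho(long long n, long long x0 = 2, long long c = 1) {
    auto f = [&](long long x) {
        return (long long)(((__int128)x * x + c) % n);
    };
    long long x = x0, y = x0, g = 1;
    while (g == 1) {
        x = f(x);      // "tortoise" moves one step
        y = f(f(y));   // "hare" moves two steps
        long long d = x > y ? x - y : y - x;
        g = std::gcd(d, n);  // x == y (mod p) is detected through the gcd
    }
    return g;
}
```

To factor a number completely, one would typically combine such a routine with a primality test (e.g. Miller–Rabin) and recurse on the returned factors.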






























**Note**: This item uses the term "_walk_" rather than a "_path_" for a reason, as the vertices may potentially repeat in the found walk in order to make its length even. The problem of finding the shortest _path_ of even length is NP-Complete in directed graphs, and [solvable in linear time](https://onlinelibrary.wiley.com/doi/abs/10.1002/net.3230140403) in undirected graphs, but with a much more involved approach. ## Practice Problems diff --git a/src/graph/bridge-searching-online.md b/src/graph/bridge-searching-online.md index 702dc5a88..80ed25ead 100644 --- a/src/graph/bridge-searching-online.md +++ b/src/graph/bridge-searching-online.md @@ -174,7 +174,6 @@ int find_cc(int v) { } void make_root(int v) { - v = find_2ecc(v); int root = v; int child = -1; while (v != -1) { diff --git a/src/graph/bridge-searching.md b/src/graph/bridge-searching.md index 2d6596878..48a52c1f6 100644 --- a/src/graph/bridge-searching.md +++ b/src/graph/bridge-searching.md @@ -22,25 +22,34 @@ Pick an arbitrary vertex of the graph $root$ and run [depth first search](depth- Now we have to learn to check this fact for each vertex efficiently. We'll use "time of entry into node" computed by the depth first search. -So, let $tin[v]$ denote entry time for node $v$. We introduce an array $low$ which will let us check the fact for each vertex $v$. $low[v]$ is the minimum of $tin[v]$, the entry times $tin[p]$ for each node $p$ that is connected to node $v$ via a back-edge $(v, p)$ and the values of $low[to]$ for each vertex $to$ which is a direct descendant of $v$ in the DFS tree: +So, let $\mathtt{tin}[v]$ denote entry time for node $v$. We introduce an array $\mathtt{low}$ which stores, for each node $v$, the earliest entry time that $v$ or any of its descendants in the DFS tree can reach using at most one back edge. $\mathtt{low}[v]$ is the minimum of $\mathtt{tin}[v]$, the entry times $\mathtt{tin}[p]$ for each node $p$ that is connected to node $v$ via a back-edge $(v, p)$ and the values of $\mathtt{low}[to]$ for each vertex $to$ which is a direct descendant of $v$ in the DFS tree: -$$low[v] = \min \begin{cases} tin[v] \\ tin[p]& \text{ for all }p\text{ for which }(v, p)\text{ is a back edge} \\ low[to]& \text{ for all }to\text{ for which }(v, to)\text{ is a tree edge} \end{cases}$$ +$$\mathtt{low}[v] = \min \left\{ + \begin{array}{ll} + \mathtt{tin}[v] \\ + \mathtt{tin}[p] &\text{ for all }p\text{ for which }(v, p)\text{ is a back edge} \\ + \mathtt{low}[to] &\text{ for all }to\text{ for which }(v, to)\text{ is a tree edge} + \end{array} +\right\}$$ -Now, there is a back edge from vertex $v$ or one of its descendants to one of its ancestors if and only if vertex $v$ has a child $to$ for which $low[to] \leq tin[v]$. If $low[to] = tin[v]$, the back edge comes directly to $v$, otherwise it comes to one of the ancestors of $v$. +Now, there is a back edge from vertex $v$ or one of its descendants to one of its ancestors if and only if vertex $v$ has a child $to$ for which $\mathtt{low}[to] \leq \mathtt{tin}[v]$. If $\mathtt{low}[to] = \mathtt{tin}[v]$, the back edge comes directly to $v$, otherwise it comes to one of the ancestors of $v$. -Thus, the current edge $(v, to)$ in the DFS tree is a bridge if and only if $low[to] > tin[v]$. +Thus, the current edge $(v, to)$ in the DFS tree is a bridge if and only if $\mathtt{low}[to] > \mathtt{tin}[v]$. 
## Implementation The implementation needs to distinguish three cases: when we go down the edge in DFS tree, when we find a back edge to an ancestor of the vertex and when we return to a parent of the vertex. These are the cases: -- $visited[to] = false$ - the edge is part of DFS tree; -- $visited[to] = true$ && $to \neq parent$ - the edge is back edge to one of the ancestors; +- $\mathtt{visited}[to] = false$ - the edge is part of DFS tree; +- $\mathtt{visited}[to] = true$ && $to \neq parent$ - the edge is a back edge to one of the ancestors; - $to = parent$ - the edge leads back to parent in DFS tree. To implement this, we need a depth first search function which accepts the parent vertex of the current node. -```cpp +In the case of multiple edges, we need to be careful not to skip every edge to the parent: only one of them is the tree edge, and any further copy must be treated as a back edge. To handle this, we can add a flag `parent_skipped` which ensures we skip the parent only once. + +```{.cpp file=bridge_searching_offline}
+void IS_BRIDGE(int v,int to); // some function to process the found bridge
 int n; // number of nodes
 vector
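// NOTE: the diff is cut off at this point; the rest of the snippet is not shown above.
// What follows is a hedged editorial sketch of how the DFS described in the text is
// usually completed with the `parent_skipped` flag. It restates the adjacency-list
// declaration that is truncated above and is not necessarily the exact code of this PR.
vector<vector<int>> adj; // adjacency list of graph

vector<bool> visited;
vector<int> tin, low;
int timer;

void dfs(int v, int p = -1) {
    visited[v] = true;
    tin[v] = low[v] = timer++;
    bool parent_skipped = false; // skip the tree edge to the parent only once
    for (int to : adj[v]) {
        if (to == p && !parent_skipped) {
            parent_skipped = true; // a second copy of this edge is a back edge
            continue;
        }
        if (visited[to]) {
            low[v] = min(low[v], tin[to]); // back edge
        } else {
            dfs(to, v); // tree edge
            low[v] = min(low[v], low[to]);
            if (low[to] > tin[v])
                IS_BRIDGE(v, to); // no back edge bypasses (v, to), so it is a bridge
        }
    }
}

void find_bridges() {
    timer = 0;
    visited.assign(n, false);
    tin.assign(n, -1);
    low.assign(n, -1);
    for (int i = 0; i < n; ++i) {
        if (!visited[i])
            dfs(i);
    }
}
```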



















+


*In the image, the graph on the left is the MST and the graph on the right is the second best MST.* - +




+A visual representation of simulated annealing, searching for the maximum of this function with multiple local maxima. +
+



