Skip to content

Commit ba84614

Browse files
authored
On Parallel Binary Search
1 parent 820fc4a commit ba84614

File tree

1 file changed

+137
-1
lines changed

1 file changed

+137
-1
lines changed

src/num_methods/binary_search.md

Lines changed: 137 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -71,7 +71,7 @@ while (r - l > 1) {
7171

7272
During the execution of the algorithm, we never evaluate neither $A_L$ nor $A_R$, as $L < M < R$. In the end, $L$ will be the index of the last element that is not greater than $k$ (or $-1$ if there is no such element) and $R$ will be the index of the first element larger than $k$ (or $n$ if there is no such element).
7373

74-
**Note.** Calculating `m` as `m = (r + l) / 2` can lead to overflow if `l` and `r` are two positive integers, and this error lived about 9 years in JDK as described in the [blogpost](https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html). Some alternative approaches include e.g. writing `m = l + (r - l) / 2` which always works for positive integer `l` and `r`, but might still overflow if `l` is a negative number. If you use C++20, it offers an alternative solution in the form of `m = std::midpoint(l, r)` which always works correctly.
74+
**Note.** Calculating `m` as `m = (r + l) / 2` can lead to overflow if `l` and `r` are two positive integers, and this error lived about 9 years in JDK as described in the [blogpost](https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html). Some alternative approaches include e.g. writing `m = l + (r - l) / 2` which always works for positive integer `l` and `r`, but might still overflow if `l` is a negative number. If you use C++20, it offers an alternative solution in the form of `m = midpoint(l, r)` which always works correctly.
7575

7676
## Search on arbitrary predicate
7777

@@ -138,6 +138,134 @@ Another noteworthy way to do binary search is, instead of maintaining an active
138138

139139
This paradigm is widely used in tasks around trees, such as finding lowest common ancestor of two vertices or finding an ancestor of a specific vertex that has a certain height. It could also be adapted to e.g. find the $k$-th non-zero element in a Fenwick tree.
140140

141+
## Parallel Binary Search
142+
143+
[^1] Imagine that we want to answer $Z$ queries about the index of the largest value less than or equal to some $X_i$ (for $i=1,2,\ldots,Z$) in some sorted 0-indexed array $A$. Naturally, each query can be answered using binary search.
144+
145+
Specifally, let us consider the following array $A$:
146+
147+
| $A_0$ | $A_1$ | $A_2$ | $A_3$ | $A_4$ | $A_5$ | $A_6$ | $A_7$ |
148+
|-------|-------|-------|-------|-------|-------|-------|-------|
149+
| 1 | 3 | 5 | 7 | 9 | 9 | 13 | 15 |
150+
151+
with queries: $X = [8,11,4,5]$. We can use binary search for each query sequentially.
152+
153+
<table>
154+
<tr>
155+
<th>query</th>
156+
<th>$ X_1 = 8 $</th>
157+
<th>$ X_2 = 11 $</th>
158+
<th>$ X_3 = 4 $</th>
159+
<th>$ X_4 = 5 $</th>
160+
</tr>
161+
<tr>
162+
<th rowspan="3">step 1</th>
163+
<td>answer in $[0,7]$</td>
164+
<td>answer in $[0,7]$</td>
165+
<td>answer in $[0,7]$</td>
166+
<td>answer in $[0,7]$</td>
167+
</tr>
168+
<tr>
169+
<td>check $ A_4 $</td>
170+
<td>check $ A_4 $</td>
171+
<td>check $ A_4 $</td>
172+
<td>check $ A_4 $</td>
173+
</tr>
174+
<tr>
175+
<td>$ X_1 < A_4 = 9 $</td>
176+
<td>$ X_2 \geq A_4 = 9 $</td>
177+
<td>$ X_3 < A_4 = 9 $</td>
178+
<td>$ X_4 < A_4 = 9 $</td>
179+
</tr>
180+
<tr>
181+
<th rowspan="3">step 2</th>
182+
<td>answer in $[0,3]$</td>
183+
<td>answer in $[4,7]$</td>
184+
<td>answer in $[0,3]$</td>
185+
<td>answer in $[0,3]$</td>
186+
</tr>
187+
<tr>
188+
<td>check $ A_2 $</td>
189+
<td>check $ A_6 $</td>
190+
<td>check $ A_2 $</td>
191+
<td>check $ A_2 $</td>
192+
</tr>
193+
<tr>
194+
<td>$ X_1 \geq A_2 = 5 $</td>
195+
<td>$ X_2 < A_6 = 13 $</td>
196+
<td>$ X_3 < A_2 = 5 $</td>
197+
<td>$ X_4 \geq A_2 = 5 $</td>
198+
</tr>
199+
<tr>
200+
<th rowspan="3">step 3</th>
201+
<td>answer in $[2,3]$</td>
202+
<td>answer in $[4,5]$</td>
203+
<td>answer in $[0,1]$</td>
204+
<td>answer in $[2,3]$</td>
205+
</tr>
206+
<tr>
207+
<td>check $ A_3 $</td>
208+
<td>check $ A_5 $</td>
209+
<td>check $ A_1 $</td>
210+
<td>check $ A_3 $</td>
211+
</tr>
212+
<tr>
213+
<td>$ X_1 \geq A_3 = 7 $</td>
214+
<td>$ X_2 \geq A_5 = 9 $</td>
215+
<td>$ X_3 \geq A_1 = 3 $</td>
216+
<td>$ X_4 < A_3 = 7 $</td>
217+
</tr>
218+
<tr>
219+
<th rowspan="2">step 4</th>
220+
<td>answer in $[3,3]$</td>
221+
<td>answer in $[5,5]$</td>
222+
<td>answer in $[1,1]$</td>
223+
<td>answer in $[2,2]$</td>
224+
</tr>
225+
<tr>
226+
<td>$ index = 3 $</td>
227+
<td>$ index = 5 $</td>
228+
<td>$ index = 1 $</td>
229+
<td>$ index = 2 $</td>
230+
</tr>
231+
</table>
232+
233+
234+
We generally process this table by columns (queries), but notice that in each row we often repeat access to certain values of our array. To limit access to the values, we can process the table by rows (steps). This does not make huge difference in our small example problem (as we can access all elements in $\mathcal{O}(1)$), but in more complex problems, where computing these values is more complicated, this might be essential to solve these problems efficiently. Moreover, note that we can arbitrarily choose the order in which we answer questions in a single row. Let us look at the code implementing this approach.
235+
236+
```cpp
237+
// Computes the index of the largest value in table A less than or equal to $X_i$ for all $i$.
238+
vector<int> ParallelBinarySearch(vector<int>& A, vector<int>& X) {
239+
int N = A.size();
240+
int M = X.size();
241+
vector<int> P(M, -1);
242+
vector<int> Q(M, N-1);
243+
244+
for (int step = 1; step <= ceil(log2(N)); ++step) {
245+
// Map to store indices of queries asking for this value.
246+
unordered_map<int, vector<int>> important_values;
247+
248+
// Calculate mid and populate the important_values map.
249+
for (int i = 0; i < M; ++i) {
250+
int mid = (P[i] + Q[i]) / 2;
251+
important_values[mid].push_back(i);
252+
}
253+
254+
// Process each value in important_values.
255+
for (const auto& [mid, queries]: important_values) {
256+
for (int query : queries) {
257+
if (A[mid] > X[query]) {
258+
Q[query] = mid;
259+
} else {
260+
P[query] = mid;
261+
}
262+
}
263+
}
264+
}
265+
return P;
266+
}
267+
```
268+
141269
## Practice Problems
142270
143271
* [LeetCode - Find First and Last Position of Element in Sorted Array](https://leetcode.com/problems/find-first-and-last-position-of-element-in-sorted-array/)
@@ -154,3 +282,11 @@ This paradigm is widely used in tasks around trees, such as finding lowest commo
154282
* [Codeforces - GukiZ hates Boxes](https://codeforces.com/problemset/problem/551/C)
155283
* [Codeforces - Enduring Exodus](https://codeforces.com/problemset/problem/645/C)
156284
* [Codeforces - Chip 'n Dale Rescue Rangers](https://codeforces.com/problemset/problem/590/B)
285+
286+
### Parallel Binary Search
287+
288+
* [Szkopul - Meteors](https://szkopul.edu.pl/problemset/problem/7JrCYZ7LhEK4nBR5zbAXpcmM/site/?key=statement)
289+
* [AtCoder - Stamp Rally](https://atcoder.jp/contests/agc002/tasks/agc002_d)
290+
291+
292+
[^1]: Note that this section is following the description in [Sports programming in practice](https://kostka.dev/sp/).

0 commit comments

Comments
 (0)