Skip to content

Commit ffdf1e5

Browse files
committed
clean up bitwise manipulation article and add additional infos
1 parent 8109db2 commit ffdf1e5

File tree

1 file changed

+113
-92
lines changed

1 file changed

+113
-92
lines changed

src/algebra/bit-manipulation.md

Lines changed: 113 additions & 92 deletions
Original file line numberDiff line numberDiff line change
@@ -1,69 +1,60 @@
1-
# Binary number
2-
3-
A **binary number** is a number expressed in the base-2 numeral system or binary numeral system, a method of mathematical expression which uses only two symbols: typically "0" (zero) and "1" (one).
4-
5-
We are talking about a numerical system of great importance in technological development. Among some of the most outstanding applications are the representation of images and texts, in the field of electronics. With the combination of zeros and ones, any number can be represented on a decimal basis, making communication with electronic devices easier for humans. In addition to translating the different programming languages, to a common language where the computer understands.
6-
7-
**In this article I will not talk about how to convert to binary or decimal. There is a lot of information all over the internet on the subject.**
8-
1+
---
2+
tags:
3+
- Original
4+
---
95
# Bit manipulation
106

11-
When finding solutions to problems, a very important factor is the simplicity and the execution time. Now, in this bitwise operations are very important and you can take advantage of them.
12-
13-
# Bit operators
14-
15-
## Bit shift operators
16-
### Multiplication and division by powers of 2
17-
18-
There are two operators for shifting bits.
19-
20-
$\gg$ One bit to the right. Which would be the same as a division by a power of two.
21-
22-
$\ll$ One bit to the left. And this would be a multiplication, equally by a power of two.
23-
24-
Let's see an example.
25-
26-
Let's say we have a number $N$, which. $N = 4$.
27-
28-
The binary representation of $4$ is:
7+
## Binary number
298

30-
$4 = 100_2$
9+
A **binary number** is a number expressed in the base-2 numeral system or binary numeral system, it is a method of mathematical expression which uses only two symbols: typically "0" (zero) and "1" (one).
3110

32-
If we shift one bit to the left, then we would be adding a bit to the end, in this case a 0.
11+
We say that a certain bit is **set**, if it is one, and **cleared** if it is zero.
3312

34-
$4 \ll 1 = 100_2 \ll 1 = 1000_2$
13+
The binary number $(a_k a_{k-1} \dots a_1 a_0)_2$ represents the number:
3514

36-
$1000_2 = 8$ and clearly, $4 * 2 = 8$
15+
$$(a_k a_{k-1} \dots a_1 a_0)_2 = a_k \cdot 2^k + a_{k-1} \cdot 2^{k-1} + \dots + a_1 \cdot 2^1 + a_0 \cdot 2^0.$$
3716

38-
So, for a division by two, it would be exactly the same, but using the operator $<<$.
17+
For instance the binary number $1101_2$ represents the number $13$:
3918

40-
$4 \gg 1 = 100_2 \gg 1 = 10_2$
19+
$$\begin{align}
20+
1101_2 &= 1 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 \\
21+
&= 1\cdot 8 + 1 \cdot 4 + 0 \cdot 2 + 1 \cdot 1 = 13
22+
\end{align}$$
4123

42-
In this case, we would remove one bit from the end, this is what is called the **least significant bit**, the rightmost bit of the representation.
24+
Computers represent integers as binary numbers.
25+
Positive integers (both signed and unsigned) are just represented with their binary digits, and negative signed numbers (which can be positive and negative) are usually represented with the [Two's complement](https://en.wikipedia.org/wiki/Two%27s_complement).
4326

44-
Of course, the **most significant bit** is the leftmost bit.
27+
```cpp
28+
unsigned int unsigned_number = 13;
29+
assert(unsigned_number == 0b1101);
4530

46-
And then, how could it carry out operations with other powers, greater than two?
31+
int positive_signed_number = 13;
32+
assert(positive_signed_number == 0b1101);
4733

48-
Easy, you just have to remember the exponent of two, to create said power. In other words:
34+
int negative_signed_number = -13;
35+
assert(negative_signed_number == 0b1111'1111'1111'1111'1111'1111'1111'0011);
36+
```
4937
50-
$4 = 2 ^ 2$, so to do multiplication or division operations with **4**, we must move **2 bits**.
38+
CPUs are very fast manipulating those bits with specific operations.
39+
For some problems we can take these binary number representations to our advantage, and speed up the execution time.
40+
And for some problems (typically in combinatorics or dynamic programming) where we want to track which objects we already picked from a given set of objects, we can just use an large enought integer where each digit represents an object and depending on if we pick or drop the object we set or clear the digit.
5141
52-
$8 = 2 ^ 3$, so for $8$ it would be **3 bits**. *And so on*.
42+
## Bit operators
5343
44+
All those introduced operators are instant (same speed as an addition) on a CPU for fixed-length integers.
5445
55-
## Bitwise operators.
46+
### Bitwise operators
5647
57-
```
58-
& : The bitwise AND operator compares each bit of its first operand with the corresponding bit of its second operand.
48+
- $\&$ : The bitwise AND operator compares each bit of its first operand with the corresponding bit of its second operand.
5949
If both bits are 1, the corresponding result bit is set to 1. Otherwise, the corresponding result bit is set to 0.
6050
61-
| : The bitwise inclusive OR operator compares each bit of its first operand with the corresponding bit of its second operand.
51+
- $|$ : The bitwise inclusive OR operator compares each bit of its first operand with the corresponding bit of its second operand.
6252
If one of the two bits is 1, the corresponding result bit is set to 1. Otherwise, the corresponding result bit is set to 0.
6353
64-
^ : The bitwise exclusive OR (XOR) operator compares each bit of its first operand with the corresponding bit of its second operand.
54+
- $\wedge$ : The bitwise exclusive OR (XOR) operator compares each bit of its first operand with the corresponding bit of its second operand.
6555
If one bit is 0 and the other bit is 1, the corresponding result bit is set to 1. Otherwise, the corresponding result bit is set to 0.
66-
```
56+
57+
- $\sim$ : The bitwise complement (NOT) operator flips each bit of a number, if a bit is set the operator will clear it, if it is cleared the operator sets it.
6758
6859
Examples:
6960
@@ -88,70 +79,79 @@ n-1 = 01010111
8879
n ^ (n-1) = 00001111
8980
```
9081
91-
# Useful tricks.
82+
```
83+
n = 01011000
84+
--------------------
85+
~n = 10100111
86+
```
9287
93-
## A bit is set (1) or cleared (0)
88+
### Shift operators
9489
95-
The value of the $x$-th bit can be by shifting the number $x$ positions to the right, the $x$-th bit is the units place, therefore we can extract it by performing a bitwise & with 1
90+
There are two operators for shifting bits.
91+
92+
- $\gg$ Shifts a number to the right by removing the last few binary digits of the number.
93+
Each shift by one represents an integer division by 2, so a right shift by $k$ represents an integer division by $2^k$.
94+
95+
E.g. $5 \gg 2 = 101_2 \gg 2 = 1_2 = 1$ which is the same as $\frac{5}{2^2} = \frac{5}{4} = 1$.
96+
For a computer though shifting some bits is a lot faster than doing divisions.
97+
98+
- $\ll$ Shifts a number to left by appending zero digits.
99+
In similar fashion to a right shift by $k$, a left shift by $k$ represents a multiplication by $2^k$.
100+
101+
E.g. $5 \ll 3 = 101_2 \ll 3 = 101000_2 = 40$ which is the same as $5 \cdot 2^3 = 5 \cdot 8 = 40$.
102+
103+
Notice however that for a fixed-length integer that means dropping the most left digits, and if you shift too much you end up with the number $0$.
96104
97-
``` cpp
98-
bool check(int number, int x) {
99-
return ((number >> x) & 1);
100-
}
101-
```
102105
103-
## Parity of a number
106+
## Useful tricks
104107
105-
Function to know if a number is even or odd.
108+
### Set/flip/clear a bit
106109
107-
True if number is odd, false for number is even.
110+
Using bitwise shifts and some basic bitwise operations we can easily set, flip or clear a bit.
111+
$1 \ll x$ is a number with only the $x$-th bit set, while $\sim(1 \ll x)$ is a number with all bits set except the $x$-th bit.
112+
113+
- $n ~|~ (1 \ll x)$ sets the $x$-th bit in the number $n$
114+
- $n ~\wedge~ (1 \ll x)$ flips the $x$-th bit in the number $n$
115+
- $n ~\&~ \sim(1 \ll x)$ clears the $x$-th bit in the number $n$
116+
117+
### Check if a bit is set
118+
119+
The value of the $x$-th bit can be by shifting the number $x$ positions to the right, the $x$-th bit is the units place, therefore we can extract it by performing a bitwise & with 1
108120
109121
``` cpp
110-
bool check(int number) {
111-
return (number & 1);
122+
bool is_set(unsigned int number, int x) {
123+
return (number >> x) & 1;
112124
}
113125
```
114126

115-
## Check if an integer is a power of 2
127+
### Check if an integer is a power of 2
128+
129+
A power of two is a number that has only a single bit in it (e.g. $32 = 0010~0000_2$), while the predecessor of that number has that digit not set and all the digits after it set ($31 = 0001~1111_2$).
130+
So the bitwise AND of a number with it's predecessor will always be 0, as they don't have any common digits set.
131+
You can easily check that this only happens for the the power of twos and for the number $0$ which already has no digit set.
116132

117133
``` cpp
118-
bool powerTwo = n && !(n & (n - 1));
134+
bool isPowerOfTwo(unsigned int n) {
135+
return n && !(n & (n - 1));
136+
}
119137
```
120138
121-
## Clear the most-right set bit
139+
### Clear the most-right set bit
122140
123-
The expression **n & (n-1)** can be used to turn off the rightmost set bit of a number **n**. This works like the expression **n-1** flips all bits after the rightmost set bit of n, including the rightmost set bit. Therefore, **n & (n-1)** returns the last flipped bit of **n**.
141+
The expression $n ~\&~ (n-1)$ can be used to turn off the rightmost set bit of a number $n$.
142+
This works because the expression $n-1$ flips all bits after the rightmost set bit of $n$, including the rightmost set bit.
143+
So all those digits are different from the original number, and by doing a bitwise AND they are all set to 0, giving you the original number $n$ with the rightmost set bit flipped.
124144
125-
For example, consider the number 52, which is 00110100 in binary, and has a total set of 3 bits.
145+
For example, consider the number $52 = 0011~0100_2$:
126146
127147
```
128-
1st iteration of the loop: n = 52
129-
130-
00110100 & (n)
131-
00110011 (n-1)
132-
~~~~~~~~
133-
00110000
134-
```
135-
136-
```
137-
2nd iteration of the loop: n = 48
138-
139-
00110000 & (n)
140-
00101111 (n-1)
141-
~~~~~~~~
142-
00100000
143-
```
144-
145-
```
146-
3rd iteration of the loop: n = 32
147-
148-
00100000 & (n)
149-
00011111 (n-1)
150-
~~~~~~~~
151-
00000000 (n = 0)
148+
n = 00110100
149+
n-1 = 00110011
150+
--------------------
151+
n & (n-1) = 00110000
152152
```
153153
154-
## Brian Kernighan's algorithm.
154+
### Brian Kernighan's algorithm
155155
156156
We can count the number of bits set with the above expression.
157157
@@ -161,21 +161,42 @@ The idea is to consider only the set bits of an integer by turning off its right
161161
int countSetBits(int n)
162162
{
163163
int count = 0;
164-
165164
while (n)
166165
{
167166
n = n & (n - 1);
168167
count++;
169168
}
170-
171169
return count;
172170
}
173171
```
174172

175-
Additionally, there are also predefined functions in C++, to count the number of bits in a representation.
173+
### Addtional tricks
174+
175+
- $n ~\&~ (n + 1)$ clears all trailing ones: $0011~0111_2 \rightarrow 0011~0000_2$.
176+
- $n ~|~ (n + 1)$ sets the last cleared bit: $0011~0101_2 \rightarrow 0011~0111_2$.
177+
- $n ~\&~ -n$ extracts the last set bit: $0011~0100_2 \rightarrow 0000~0100_2$.
178+
179+
Many more can be found in the book [Hacker's Delight](https://en.wikipedia.org/wiki/Hacker%27s_Delight).
180+
181+
### Language and compiler support
182+
183+
C++ supports some of those operations since C++20 via the [bit](https://en.cppreference.com/w/cpp/header/bit) standard library:
184+
185+
- `has_single_bit`: checks if the number is a power of two
186+
- `bit_ceil` / `bit_floor`: round up/down to the next power of two
187+
- `rotl` / `rotr`: rotate the bits in the number
188+
- `countl_zero` / `countr_zero` / `countl_one` / `countr_one`: count the leading/trailing zeros/ones
189+
- `popcount`: count the number of set bits
190+
191+
Additionally, there are also predefined functions in some compilers that help working with bits.
192+
E.g. GCC defines a list at [Built-in Functions Provided by GCC](https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html) that also work in older versions of C++:
193+
194+
- `__builtin_popcount(unsigned int)` returns the number of set bits (`__builtin_popcount(0b0001'0010'1100) == 4`)
195+
- `__builtin_ffs(int)` finds the index of the first (most right) set bit (`__builtin_ffs(0b0001'0010'1100) == 3`)
196+
- `__builtin_clz(unsigned int)` the count of leading zeros (`__builtin_clz(0b0001'0010'1100) == 23`)
197+
- `__builtin_ctz(unsigned int)` the count of trailing zeros (`__builtin_ctz(0b0001'0010'1100) == 2`)
176198

177-
1. ```__builtin_popcount(number)``` number of bits set.
178-
2. ```__builtin_ctz(number)``` number of bits cleared.
199+
_Note that some of the operations (both the C++20 functions and the Compiler Built-in ones) might be quite slow in GCC if you don't enable a specific compiler target with `#pragma GCC target("popcnt")`._
179200

180201
## Practice Problems
181202

0 commit comments

Comments
 (0)