Skip to content

Commit 4d4ed92

Browse files
authored
Removes 5 KB of tables in the number parsing routine (simdjson#1139)
* Removes 5 KB of tables at the expense, and a load, at the expense of a multiplication and a shift. I have not benchmarked this new code, but my expectation is that it should be largely performance neutral. The motivation is to reduce the size of the library slightly. There is also a matter of elegance.
1 parent 4c11652 commit 4d4ed92

File tree

2 files changed

+676
-977
lines changed

2 files changed

+676
-977
lines changed

src/generic/stage2/numberparsing.h

Lines changed: 33 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -90,14 +90,39 @@ simdjson_really_inline bool compute_float_64(int64_t power, uint64_t i, bool neg
9090

9191
// We are going to need to do some 64-bit arithmetic to get a more precise product.
9292
// We use a table lookup approach.
93-
components c =
94-
power_of_ten_components[power - FASTFLOAT_SMALLEST_POWER];
95-
// safe because
96-
// power >= FASTFLOAT_SMALLEST_POWER
97-
// and power <= FASTFLOAT_LARGEST_POWER
98-
// we recover the mantissa of the power, it has a leading 1. It is always
93+
// It is safe because
94+
// power >= FASTFLOAT_SMALLEST_POWER
95+
// and power <= FASTFLOAT_LARGEST_POWER
96+
// We recover the mantissa of the power, it has a leading 1. It is always
9997
// rounded down.
100-
uint64_t factor_mantissa = c.mantissa;
98+
uint64_t factor_mantissa = mantissa_64[power - FASTFLOAT_SMALLEST_POWER];
99+
100+
// The exponent is 1024 + 63 + power
101+
// + floor(log(5**power)/log(2)).
102+
// The 1024 comes from the ieee64 standard.
103+
// The 63 comes from the fact that we use a 64-bit word.
104+
//
105+
// Computing floor(log(5**power)/log(2)) could be
106+
// slow. Instead we use a fast function.
107+
//
108+
// For power in (-400,350), we have that
109+
// (((152170 + 65536) * power ) >> 16);
110+
// is equal to
111+
// floor(log(5**power)/log(2)) + power
112+
//
113+
// The 65536 is (1<<16) and corresponds to
114+
// (65536 * power) >> 16 ---> power
115+
//
116+
// ((152170 * power ) >> 16) is equal to
117+
// floor(log(5**power)/log(2))
118+
//
119+
// Note that this is not magic: 152170/(1<<16) is
120+
// approximatively equal to log(5)/log(2).
121+
// The 1<<16 value is a power of two; we could use a
122+
// larger power of 2 if we wanted to.
123+
//
124+
int64_t exponent = (((152170 + 65536) * power) >> 16) + 1024 + 63;
125+
101126

102127
// We want the most significant bit of i to be 1. Shift if needed.
103128
int lz = leading_zeroes(i);
@@ -188,7 +213,7 @@ simdjson_really_inline bool compute_float_64(int64_t power, uint64_t i, bool neg
188213
lz--; // undo previous addition
189214
}
190215
mantissa &= ~(1ULL << 52);
191-
uint64_t real_exponent = c.exp - lz;
216+
uint64_t real_exponent = exponent - lz;
192217
// we have to check that real_exponent is in range, otherwise we bail out
193218
if (simdjson_unlikely((real_exponent < 1) || (real_exponent > 2046))) {
194219
return false;

0 commit comments

Comments
 (0)