Commit 32afcd2

Better documentation for issue 70 (simdjson#638)
1 parent 47859f3 commit 32afcd2

2 files changed

+38
-1
lines changed

doc/performance.md

Lines changed: 13 additions & 0 deletions
@@ -152,3 +152,16 @@ If you wish to forcefully disable computed gotos, you can do so by compiling the
 `-DSIMDJSON_NO_COMPUTED_GOTO=1`. It is not recommended to disable computed gotos if your compiler
 supports it. In fact, you should almost never need to be concerned with computed gotos.
 
+Number parsing
+--------------
+
+Some JSON files contain many floating-point values. This is the case with many GeoJSON files. Accurately
+parsing decimal strings into binary floating-point values with proper rounding is challenging. To
+our knowledge, it is not possible, in general, to parse streams of numbers at gigabytes per second
+using a single core. While using the simdjson library, you may be limited to a
+few hundred megabytes per second if your JSON documents are densely packed with floating-point values.
+
+
+- When possible, you should favor integer values written without a decimal point, as it is simpler and faster to parse decimal integer values.
+- When serializing numbers, you should not use more digits than necessary: 17 significant digits are always enough to identify a double-precision floating-point number uniquely. Using many more digits than necessary will make your files larger and slower to parse.
+- When benchmarking parsing speeds, always report whether your JSON documents consist mostly of floating-point numbers, since number parsing can then dominate the parsing time.
Lines changed: 25 additions & 1 deletion
@@ -1,4 +1,28 @@
 Files from https://github.com/plokhotnyuk/jsoniter-scala/tree/master/jsoniter-scala-benchmark/src/main/resources/com/github/plokhotnyuk/jsoniter_scala/benchmark
 
-See issue "Lower performance on small files":
+See issue:
 https://github.com/lemire/simdjson/issues/70
+
+The files che-*.geo.json are number-parsing stress tests.
+
+```
+$ for i in *.json ; do echo $i; ./parsingcompetition $i ; done
+che-1.geo.json
+simdjson : 4.841 cycles per input byte (best) 4.880 cycles per input byte (avg) 0.689 GB/s (error margin: 0.005 GB/s)
+RapidJSON (accurate number parsing) : 18.326 cycles per input byte (best) 19.185 cycles per input byte (avg) 0.185 GB/s (error margin: 0.008 GB/s)
+RapidJSON (insitu, accurate number parsing) : 18.158 cycles per input byte (best) 18.957 cycles per input byte (avg) 0.187 GB/s (error margin: 0.008 GB/s)
+nlohmann-json : 90.423 cycles per input byte (best) 91.077 cycles per input byte (avg) 0.038 GB/s (error margin: 0.000 GB/s)
+
+che-2.geo.json
+simdjson : 4.849 cycles per input byte (best) 4.882 cycles per input byte (avg) 0.687 GB/s (error margin: 0.005 GB/s)
+RapidJSON (accurate number parsing) : 18.248 cycles per input byte (best) 19.197 cycles per input byte (avg) 0.186 GB/s (error margin: 0.009 GB/s)
+RapidJSON (insitu, accurate number parsing) : 18.178 cycles per input byte (best) 18.951 cycles per input byte (avg) 0.186 GB/s (error margin: 0.008 GB/s)
+nlohmann-json : 91.483 cycles per input byte (best) 91.842 cycles per input byte (avg) 0.037 GB/s (error margin: 0.000 GB/s)
+
+che-3.geo.json
+simdjson : 4.862 cycles per input byte (best) 4.892 cycles per input byte (avg) 0.686 GB/s (error margin: 0.004 GB/s)
+RapidJSON (accurate number parsing) : 18.316 cycles per input byte (best) 19.202 cycles per input byte (avg) 0.185 GB/s (error margin: 0.008 GB/s)
+RapidJSON (insitu, accurate number parsing) : 18.143 cycles per input byte (best) 18.957 cycles per input byte (avg) 0.187 GB/s (error margin: 0.008 GB/s)
+nlohmann-json : 91.462 cycles per input byte (best) 91.758 cycles per input byte (avg) 0.037 GB/s (error margin: 0.000 GB/s)
+```
+