
Displaying the numbers of documents parsed per second #652


Merged
merged 5 commits into master on Mar 30, 2020

Conversation

lemire
Member

@lemire lemire commented Mar 30, 2020

Some users are interested, as a metric, in the number of documents parsed per second.

Obviously, this means reusing the same parser again and again. Suppose that you have a target: you want to parse documents like "twitter_timeline.json" that you will receive in quick succession... how many could you parse per second?

$ ./parse jsonexamples/small/twitter_timeline.json 
....
42110.6 documents parsed per second

So 42,000.

And, of course, for small documents (say demo.json), I achieve speeds in the millions of documents per second...

$ ./parse jsonexamples/small/demo.json 
number of iterations 200
....
2341481.7 documents parsed per second

For an even smaller document ({"status":"success"}), I reach nearly 4 million documents per second.

The implementation is damn ugly, but I have tried to work within the "benchmarker" framework. I probably could have done a better job.

I think it would be nice, before pushing out version 0.3, to add a remark about our ability to parse millions of documents per second. But, of course, this needs to be backed by an actual benchmark.

@lemire lemire added this to the 0.3 milestone Mar 30, 2020
@lemire lemire requested a review from jkeiser March 30, 2020 18:56
@jkeiser
Member

jkeiser commented Mar 30, 2020

Oh yes, that's SUPER nice and absolutely worth having a competition benchmark for. It would be immediately convincing to me, whereas with GB/s it takes me time to decide whether it'll make my server faster or not.

Member

@jkeiser jkeiser left a comment


Looks good.

@lemire
Member Author

lemire commented Mar 30, 2020

@jkeiser I have added a sentence in the README within this PR about our ability to parse millions of JSON documents per second (on a single core).

@lemire
Member Author

lemire commented Mar 30, 2020

I will open an issue regarding the "competition" because I don't want to start anything big right now.

@jkeiser
Member

jkeiser commented Mar 30, 2020

> I will open an issue regarding the "competition" because I don't want to start anything big right now.

I do wonder if we can make parsingcompetition spit out its results in documents/sec instead of GB/s, assuming it correctly accounts for allocation. Assuming users reuse their input buffers, it might be a reasonable proxy for requests/second.

@lemire
Member Author

lemire commented Mar 30, 2020

@jkeiser I just hacked parsingcompetition so that it spits out documents/sec but it is not very reliable. One should do a finer job. I'll open an issue.

@jkeiser
Member

jkeiser commented Mar 30, 2020

Looks good. It's a little more reliable (though not perfect) if you up the repeat multiplier to 100 with -r 100:

jkeiser@JKEISER-THINKPAD:~/simdjson$ ./parsingcompetition -v -r 100 jsonexamples/github_events.json 
Input jsonexamples/github_events.json has 63 KB: will run 10824 iterations.
getline                                         :    0.952 cycles per input byte (best)    0.997 cycles (avg)    1.986 GB/s (error margin: 0.093 GB/s)        327009 Kdocuments/s
simdjson (dynamic mem)                          :    0.653 cycles per input byte (best)    0.685 cycles (avg)    2.820 GB/s (error margin: 0.127 GB/s)        464549 Kdocuments/s
simdjson                                        :    0.646 cycles per input byte (best)    0.671 cycles (avg)    2.857 GB/s (error margin: 0.108 GB/s)        468571 Kdocuments/s
RapidJSON                                       :    9.850 cycles per input byte (best)   10.180 cycles (avg)    0.201 GB/s (error margin: 0.006 GB/s)         33131 Kdocuments/s
RapidJSON (accurate number parsing)             :    9.962 cycles per input byte (best)   10.390 cycles (avg)    0.199 GB/s (error margin: 0.008 GB/s)         32495 Kdocuments/s
RapidJSON (insitu)                              :    5.118 cycles per input byte (best)    5.311 cycles (avg)    0.385 GB/s (error margin: 0.014 GB/s)         63003 Kdocuments/s
RapidJSON (insitu, accurate number parsing)     :    5.693 cycles per input byte (best)    6.042 cycles (avg)    0.346 GB/s (error margin: 0.020 GB/s)         55366 Kdocuments/s
sajson (dynamic mem)                            :    2.082 cycles per input byte (best)    2.572 cycles (avg)    0.932 GB/s (error margin: 0.175 GB/s)        151597 Kdocuments/s
sajson                                          :    1.869 cycles per input byte (best)    1.927 cycles (avg)    1.037 GB/s (error margin: 0.032 GB/s)        170457 Kdocuments/s
nlohmann-json                                   :   19.132 cycles per input byte (best)   19.425 cycles (avg)    0.104 GB/s (error margin: 0.002 GB/s)         15832 Kdocuments/s
memcpy                                          :    0.054 cycles per input byte (best)    0.059 cycles (avg)   18.609 GB/s (error margin: 1.524 GB/s)       2925405 Kdocuments/s

@lemire
Member Author

lemire commented Mar 30, 2020

@jkeiser Changing the number of repetitions exposes a bug in my new code for parsingcompetition. Let me check.

@lemire
Member Author

lemire commented Mar 30, 2020

Ok. Should be good now.

@lemire
Member Author

lemire commented Mar 30, 2020

This should be good enough for a release now.

@jkeiser
Member

jkeiser commented Mar 30, 2020

@lemire any reason you're re-measuring total time instead of just using sumclockdiff?

I used min_sumclockdiff and sumclockdiff for best/avg, and came up with this:

./parsingcompetition -v -r 100 jsonexamples/github_events.json
getline                                         :    0.932 cycles per input byte (best)    0.984 cycles (avg)    2.029 GB/s (error margin: 0.112 GB/s)         31153 Kdocuments/s (best)         29426 Kdocuments/s (avg)
simdjson (dynamic mem)                          :    0.699 cycles per input byte (best)    0.763 cycles (avg)    2.648 GB/s (error margin: 0.227 GB/s)         40650 Kdocuments/s (best)         37158 Kdocuments/s (avg)
simdjson                                        :    0.694 cycles per input byte (best)    0.728 cycles (avg)    2.658 GB/s (error margin: 0.123 GB/s)         40816 Kdocuments/s (best)         38924 Kdocuments/s (avg)
RapidJSON                                       :    9.354 cycles per input byte (best)   11.148 cycles (avg)    0.212 GB/s (error margin: 0.034 GB/s)          3249 Kdocuments/s (best)          2726 Kdocuments/s (avg)
RapidJSON (accurate number parsing)             :   11.232 cycles per input byte (best)   13.076 cycles (avg)    0.176 GB/s (error margin: 0.025 GB/s)          2706 Kdocuments/s (best)          2324 Kdocuments/s (avg)
RapidJSON (insitu)                              :    6.779 cycles per input byte (best)    8.268 cycles (avg)    0.291 GB/s (error margin: 0.053 GB/s)          4466 Kdocuments/s (best)          3660 Kdocuments/s (avg)
RapidJSON (insitu, accurate number parsing)     :    6.861 cycles per input byte (best)    9.041 cycles (avg)    0.287 GB/s (error margin: 0.069 GB/s)          4413 Kdocuments/s (best)          3347 Kdocuments/s (avg)
sajson (dynamic mem)                            :    2.585 cycles per input byte (best)    3.342 cycles (avg)    0.751 GB/s (error margin: 0.169 GB/s)         11534 Kdocuments/s (best)          8932 Kdocuments/s (avg)
sajson                                          :    2.255 cycles per input byte (best)    2.729 cycles (avg)    0.857 GB/s (error margin: 0.149 GB/s)         13158 Kdocuments/s (best)         10875 Kdocuments/s (avg)
nlohmann-json                                   :   25.042 cycles per input byte (best)   27.819 cycles (avg)    0.079 GB/s (error margin: 0.008 GB/s)          1217 Kdocuments/s (best)          1095 Kdocuments/s (avg)
memcpy                                          :    0.058 cycles per input byte (best)    0.072 cycles (avg)   14.803 GB/s (error margin: 1.029 GB/s)        227273 Kdocuments/s (best)        211471 Kdocuments/s (avg)

@jkeiser
Member

jkeiser commented Mar 30, 2020

    if (verbose)                                                               \
      printf(" %13.0f Kdocuments/s (best)", 1.0/min_sumclockdiff);             \
    if (verbose)                                                               \
      printf(" %13.0f Kdocuments/s (avg)", 1.0/(sumclockdiff/repeat));                 \
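The best/avg split in that macro can be stated more plainly: take the minimum and the mean of the per-repeat timings, then invert. A standalone sketch under that reading (struct and function names are ours):

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

struct Throughput { double best; double avg; };

// times: seconds each repeat took to parse one document.
// best uses the fastest repeat (min), avg the mean over all repeats,
// mirroring min_sumclockdiff vs sumclockdiff/repeat in the macro.
static Throughput docs_per_second(const std::vector<double> &times) {
  double best_time = *std::min_element(times.begin(), times.end());
  double avg_time =
      std::accumulate(times.begin(), times.end(), 0.0) / times.size();
  return { 1.0 / best_time, 1.0 / avg_time };
}
```

Reporting both numbers exposes run-to-run noise: a large gap between best and avg means the measurement should be trusted less.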

@jkeiser
Member

jkeiser commented Mar 30, 2020

Round numbers like this are a lot easier to appreciate in terms of benchmarks, too :)

@lemire
Member Author

lemire commented Mar 30, 2020

Let us do it your way.

@jkeiser
Member

jkeiser commented Mar 30, 2020

For twitter:

./parsingcompetition -v -r 100 jsonexamples/twitter.json 
Input jsonexamples/twitter.json has 616 KB: will run 1116 iterations.
getline                                         :    1.102 cycles per input byte (best)    1.148 cycles (avg)    1.798 GB/s (error margin: 0.072 GB/s)          2847 Kdocuments/s (best)          2732 Kdocuments/s (avg)
simdjson (dynamic mem)                          :    0.766 cycles per input byte (best)    0.852 cycles (avg)    2.580 GB/s (error margin: 0.264 GB/s)          4085 Kdocuments/s (best)          3668 Kdocuments/s (avg)
simdjson                                        :    0.763 cycles per input byte (best)    0.797 cycles (avg)    2.589 GB/s (error margin: 0.110 GB/s)          4100 Kdocuments/s (best)          3927 Kdocuments/s (avg)
RapidJSON                                       :    8.930 cycles per input byte (best)    9.403 cycles (avg)    0.223 GB/s (error margin: 0.011 GB/s)           353 Kdocuments/s (best)           335 Kdocuments/s (avg)
RapidJSON (accurate number parsing)             :    9.183 cycles per input byte (best)    9.488 cycles (avg)    0.217 GB/s (error margin: 0.007 GB/s)           343 Kdocuments/s (best)           332 Kdocuments/s (avg)
RapidJSON (insitu)                              :    5.797 cycles per input byte (best)    5.992 cycles (avg)    0.343 GB/s (error margin: 0.011 GB/s)           543 Kdocuments/s (best)           526 Kdocuments/s (avg)
RapidJSON (insitu, accurate number parsing)     :    5.832 cycles per input byte (best)    6.060 cycles (avg)    0.341 GB/s (error margin: 0.013 GB/s)           540 Kdocuments/s (best)           520 Kdocuments/s (avg)
sajson (dynamic mem)                            :    2.355 cycles per input byte (best)    2.428 cycles (avg)    0.843 GB/s (error margin: 0.025 GB/s)          1335 Kdocuments/s (best)          1295 Kdocuments/s (avg)
sajson                                          :    1.931 cycles per input byte (best)    2.021 cycles (avg)    1.028 GB/s (error margin: 0.046 GB/s)          1628 Kdocuments/s (best)          1555 Kdocuments/s (avg)
nlohmann-json                                   :   19.732 cycles per input byte (best)   22.202 cycles (avg)    0.101 GB/s (error margin: 0.011 GB/s)           160 Kdocuments/s (best)           142 Kdocuments/s (avg)
memcpy                                          :    0.070 cycles per input byte (best)    0.073 cycles (avg)   25.988 GB/s (error margin: 1.035 GB/s)         41152 Kdocuments/s (best)         39513 Kdocuments/s (avg)

@jkeiser
Member

jkeiser commented Mar 30, 2020

4 million documents/s has such a nice ring to it! RapidJSON's best rate is 540 thousand, and sajson has a respectable 1.5 million.

I am still utterly boggled that we're faster than getline...

@lemire
Member Author

lemire commented Mar 30, 2020

I have updated the code so that it does what @jkeiser suggested. I like how we have two numbers (avg, best) so we can get an idea as to whether the numbers can be trusted.

I still think that this whole benchmark will fail you on small files, and it is easy to fix, but I don't want to go there today.

@jkeiser
Member

jkeiser commented Mar 30, 2020

Looks right to me! +1

The x64 CI perf check is going really haywire recently. I don't think it's your change.

@lemire lemire merged commit 21dce6c into master Mar 30, 2020
@lemire
Member Author

lemire commented Mar 30, 2020

Merged.

@lemire lemire deleted the dlemire/reportnumberofdocspersecond branch April 3, 2020 00:51