-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
bpo-37587: Make json.loads faster for long strings #14752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Our records indicate we have not received your CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. If you have recently signed the CLA, please wait at least one business day You can check yourself to see if the CLA has been received. Thanks again for your contribution, we look forward to reviewing it! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
c
is only used in the condition if (!(c == '"' || c == '\\'))
after the loop. It duplicates the condition in the loop. Maybe move it in the loop and change to if (next == len)
?
@serhiy-storchaka Thanks to @pablogsal and @zooba for helping me! |
Forces the compiler to use a register variable for a tight loop in the hot-path. It also optimizes a condition for the common case strict=true.
json.loads hot path. This partially reverts the previous commit as we found out the culprit wasn't the MOV but the strictness checks
Modules/_json.c
Outdated
/* Defer the strict error until outside this (hot) loop. */ | ||
/* See bpo-37587 */ | ||
if (c <= 0x1f && invalid < 0) { | ||
invalid = next; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems unlikely that an ordering comparison could be faster than a check against zero, so I suspect most or all of the speed-up here is coming from performing the usually-false check before the usually-true one.
What performance numbers do you get with just the reordering of the operands to put the character check first?
My other question would be to ask what happens to the assembly output if you explicitly mark "strict" as const with the original code? (the while loop body is long enough that I doubt compilers will be scanning it to see if "strict" is ever actually modified, so the explicit hint may make a difference)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just flipping around the condition makes it 11% faster than vanilla. moving the goto out of the loop gives you another 1% (maybe just because the loop is smaller)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, I'd suggest changing the revised check to be:
if (c <= 0x1f && strict) {
invalid = next;
break;
}
And then not re-checking strict outside the loop - just check for invalid >= 0
.
(The currently posted version will be much slower when it comes to rejecting bad data, as it will continue a doomed scan, potentially until the end of the file, instead of exiting as soon as it sees an invalid character)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Being slower when rejecting bad data is okay though, since it's going to result in an exception.
Our "even faster" version defers the actual check until after the loop completes, and we keep the minimum seen c
throughout the loop (which uses a branchless cmov
instruction on Intel-like architectures). That way we only check strict
once.
And I believe you're spot on - the performance improvement from simply switching comes because the first condition being false bypasses a couple of branches, whereas before they were being included, and when strict==0
it should make no difference. But avoiding the additional branching entirely is even better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last edition looks incorrect to me. Shouldn't break
be added after invalid = next
?
I am fine with just swapping strict
and c <= 0x1f
if it gives a noticeable effect (10% or like). We can try more subtle changes (with effect ~ 1%) in a separate issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we removed the break
on purpose to make the loop as small as possible, the algorithm is still correct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ncoghlan done as you suggested, just switched around the strict
check
@pablogsal some of the benchmarks we did during the sprint are attached to the issue here https://bugs.python.org/issue37587 |
Thanks @mpaolini! |
GH-15022 is a backport of this pull request to the 3.8 branch. |
When scanning the string, most characters are valid, so checking for invalid characters first means never needing to check the value of strict on valid strings, and only needing to check it on invalid characters when doing non-strict parsing of invalid strings. This provides a measurable reduction in per-character processing time (~11% in the pre-merge patch testing). (cherry picked from commit 8a758f5) Co-authored-by: Marco Paolini <mpaolini@users.noreply.github.com>
When scanning the string, most characters are valid, so checking for invalid characters first means never needing to check the value of strict on valid strings, and only needing to check it on invalid characters when doing non-strict parsing of invalid strings. This provides a measurable reduction in per-character processing time (~11% in the pre-merge patch testing). (cherry picked from commit 8a758f5) Co-authored-by: Marco Paolini <mpaolini@users.noreply.github.com>
When scanning the string, most characters are valid, so checking for invalid characters first means never needing to check the value of strict on valid strings, and only needing to check it on invalid characters when doing non-strict parsing of invalid strings. This provides a measurable reduction in per-character processing time (~11% in the pre-merge patch testing).
When scanning the string, most characters are valid, so checking for invalid characters first means never needing to check the value of strict on valid strings, and only needing to check it on invalid characters when doing non-strict parsing of invalid strings. This provides a measurable reduction in per-character processing time (~11% in the pre-merge patch testing).
When scanning the string, most characters are valid, so checking for invalid characters first means never needing to check the value of strict on valid strings, and only needing to check it on invalid characters when doing non-strict parsing of invalid strings. This provides a measurable reduction in per-character processing time (~11% in the pre-merge patch testing).
https://bugs.python.org/issue37587