Skip to content

bpo-37587: Make json.loads faster for long strings #14752

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 30, 2019

Conversation

mpaolini
Copy link
Contributor

@mpaolini mpaolini commented Jul 13, 2019

@the-knights-who-say-ni
Copy link

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Our records indicate we have not received your CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

If you have recently signed the CLA, please wait at least one business day
before our records are updated.

You can check yourself to see if the CLA has been received.

Thanks again for your contribution, we look forward to reviewing it!

@mangrisano
Copy link
Contributor

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c is only used in the condition if (!(c == '"' || c == '\\')) after the loop. It duplicates the condition in the loop. Maybe move it in the loop and change to if (next == len)?

@mpaolini
Copy link
Contributor Author

mpaolini commented Jul 13, 2019

@serhiy-storchaka c is also used later on in the if (c == '"') test . Anyays that MOV turned out to be a red herring, what made the difference whas the strict check removal.

Thanks to @pablogsal and @zooba for helping me!

mpaolini added 2 commits July 13, 2019 21:08
Forces the compiler to use a register variable for a tight loop
in the hot-path.

It also optimizes a condition for the common case strict=true.
json.loads hot path.

This partially reverts the previous commit as we found out
the culprit wasn't the MOV but the strictness checks
Modules/_json.c Outdated
/* Defer the strict error until outside this (hot) loop. */
/* See bpo-37587 */
if (c <= 0x1f && invalid < 0) {
invalid = next;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems unlikely that an ordering comparison could be faster than a check against zero, so I suspect most or all of the speed-up here is coming from performing the usually-false check before the usually-true one.

What performance numbers do you get with just the reordering of the operands to put the character check first?

My other question would be to ask what happens to the assembly output if you explicitly mark "strict" as const with the original code? (the while loop body is long enough that I doubt compilers will be scanning it to see if "strict" is ever actually modified, so the explicit hint may make a difference)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just flipping around the condition makes it 11% faster than vanilla. moving the goto out of the loop gives you another 1% (maybe just because the loop is smaller)

Copy link
Contributor

@ncoghlan ncoghlan Jul 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, I'd suggest changing the revised check to be:

if (c <= 0x1f && strict) {
    invalid = next;
    break;
}

And then not re-checking strict outside the loop - just check for invalid >= 0.

(The currently posted version will be much slower when it comes to rejecting bad data, as it will continue a doomed scan, potentially until the end of the file, instead of exiting as soon as it sees an invalid character)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being slower when rejecting bad data is okay though, since it's going to result in an exception.

Our "even faster" version defers the actual check until after the loop completes, and we keep the minimum seen c throughout the loop (which uses a branchless cmov instruction on Intel-like architectures). That way we only check strict once.

And I believe you're spot on - the performance improvement from simply switching comes because the first condition being false bypasses a couple of branches, whereas before they were being included, and when strict==0 it should make no difference. But avoiding the additional branching entirely is even better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last edition looks incorrect to me. Shouldn't break be added after invalid = next?

I am fine with just swapping strict and c <= 0x1f if it gives a noticeable effect (10% or like). We can try more subtle changes (with effect ~ 1%) in a separate issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we removed the break on purpose to make the loop as small as possible, the algorithm is still correct

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ncoghlan done as you suggested, just switched around the strict check

@pablogsal
Copy link
Member

@mpaolini Can you attach here some of the benchmarks we did together with @zooba and @tiran on EuroPython?

@mpaolini
Copy link
Contributor Author

@pablogsal some of the benchmarks we did during the sprint are attached to the issue here https://bugs.python.org/issue37587

@ncoghlan
Copy link
Contributor

Thanks @mpaolini!

@miss-islington
Copy link
Contributor

Thanks @mpaolini for the PR, and @ncoghlan for merging it 🌮🎉.. I'm working now to backport this PR to: 3.8.
🐍🍒⛏🤖

@bedevere-bot
Copy link

GH-15022 is a backport of this pull request to the 3.8 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jul 30, 2019
When scanning the string, most characters are valid, so
checking for invalid characters first means never needing
to check the value of strict on valid strings, and only
needing to check it on invalid characters when doing
non-strict parsing of invalid strings.

This provides a measurable reduction in per-character
processing time (~11% in the pre-merge patch testing).
(cherry picked from commit 8a758f5)

Co-authored-by: Marco Paolini <mpaolini@users.noreply.github.com>
miss-islington added a commit that referenced this pull request Jul 30, 2019
When scanning the string, most characters are valid, so
checking for invalid characters first means never needing
to check the value of strict on valid strings, and only
needing to check it on invalid characters when doing
non-strict parsing of invalid strings.

This provides a measurable reduction in per-character
processing time (~11% in the pre-merge patch testing).
(cherry picked from commit 8a758f5)

Co-authored-by: Marco Paolini <mpaolini@users.noreply.github.com>
lisroach pushed a commit to lisroach/cpython that referenced this pull request Sep 10, 2019
When scanning the string, most characters are valid, so
checking for invalid characters first means never needing
to check the value of strict on valid strings, and only
needing to check it on invalid characters when doing
non-strict parsing of invalid strings.

This provides a measurable reduction in per-character
processing time (~11% in the pre-merge patch testing).
DinoV pushed a commit to DinoV/cpython that referenced this pull request Jan 14, 2020
When scanning the string, most characters are valid, so
checking for invalid characters first means never needing
to check the value of strict on valid strings, and only
needing to check it on invalid characters when doing
non-strict parsing of invalid strings.

This provides a measurable reduction in per-character
processing time (~11% in the pre-merge patch testing).
websurfer5 pushed a commit to websurfer5/cpython that referenced this pull request Jul 20, 2020
When scanning the string, most characters are valid, so
checking for invalid characters first means never needing
to check the value of strict on valid strings, and only
needing to check it on invalid characters when doing
non-strict parsing of invalid strings.

This provides a measurable reduction in per-character
processing time (~11% in the pre-merge patch testing).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants