Skip to content

gh-125196: Use PyUnicodeWriter for JSON encoder #125249

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 10, 2024

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Oct 10, 2024

Replace the private _PyUnicodeWriter with the public PyUnicodeWriter.

Replace the private _PyUnicodeWriter with the public PyUnicodeWriter.
return NULL;
}
PyObject *result = PyTuple_Pack(1, str);
Copy link
Contributor

@eendebakpt eendebakpt Oct 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this new code is cleaner than the old code. But PyTuple_Pack is a bit slow (see #118222), so we should measure whether this change makes an impact. Since this is not in one of the inner loops, I suspect the impact is really small.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment below. It's not faster, it's not slower. The change has no significant effect on performance.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that PyTuple_Pack() is the performance bottleneck of this function.

I chose to use PyTuple_Pack() to avoid creating an incomplete tuple and then fill it. I prefer an atomic API which creates directly valid tuple.

@vstinner
Copy link
Member Author

Benchmark:

import pyperf
import json
runner = pyperf.Runner()
runner.bench_func('dumps [1,2,3]', json.dumps, [1, 2, 3])
runner.bench_func("dumps ('baz', None, 1.0, 2)", json.dumps, ('baz', None, 1.0, 2))
runner.bench_func('dumps {"c": 0, ...}', json.dumps, {"c": 0, "b": 0, "a": 0})

Result, Python built with gcc -O3, CPU isolation:

+-----------------------------+---------+-----------------------+
| Benchmark                   | ref     | change                |
+=============================+=========+=======================+
| dumps [1,2,3]               | 1.57 us | 1.53 us: 1.02x faster |
+-----------------------------+---------+-----------------------+
| dumps ('baz', None, 1.0, 2) | 1.62 us | 1.68 us: 1.04x slower |
+-----------------------------+---------+-----------------------+
| Geometric mean              | (ref)   | 1.00x slower          |
+-----------------------------+---------+-----------------------+

Benchmark hidden because not significant (1): dumps {"c": 0, ...}

@vstinner vstinner merged commit c914212 into python:main Oct 10, 2024
37 checks passed
@vstinner vstinner deleted the writer_json branch October 10, 2024 13:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants