
json: Optimize escaping string in Encoder #133186


Status: Open. Wants to merge 6 commits into main.

Conversation

@methane (Member) commented Apr 30, 2025

No description provided.

@methane added the performance (Performance or resource usage), skip issue, and extension-modules (C modules in the Modules dir) labels on Apr 30, 2025
@methane force-pushed the optimize-json-encode branch from cd2e18c to 59e5131 on April 30, 2025 07:32
@methane requested a review from Copilot on April 30, 2025 07:33
@methane (Member, Author) commented Apr 30, 2025

Without --enable-optimizations:

https://github.com/python/pyperformance/blob/main/pyperformance/data-files/benchmarks/bm_json_dumps/run_benchmark.py

Mean +- std dev: [main] 9.25 ms +- 0.07 ms -> [patched] 7.68 ms +- 0.03 ms: 1.20x faster
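For a rough local comparison without the full pyperformance setup, something like the following can be used (the payload shape here is an assumption for illustration, not the actual bm_json_dumps data):

```python
import json
import timeit

# Hypothetical payload; bm_json_dumps uses its own fixed data set.
data = {"strings": ["hello world"] * 256, "ints": list(range(256))}

# Time 1000 serializations of the payload.
elapsed = timeit.timeit(lambda: json.dumps(data), number=1000)
print(f"json.dumps x1000: {elapsed * 1000:.2f} ms")
```

For stable numbers like the ones above, `pyperf` (which bm_json_dumps is built on) is preferable to raw `timeit`.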

@mdboom (Contributor) commented Apr 30, 2025

I'm going to benchmark this on pyperformance on the Faster CPython infrastructure and report back in a couple of hours.

@nineteendo (Contributor)
I benchmarked this feature on my own library and I'm a bit worried. Strings without escapes are faster, but strings with escapes are a lot slower:

| encode | json (setuptools) | jsonyx (2.2.1) | reference time |
|---|---|---|---|
| List of 256 ASCII strings | 1.00x | 0.89x | 49.97 μs |
| List of 256 dicts with 1 int | 1.00x | 1.02x | 90.40 μs |
| Medium complex object | 1.00x | 1.06x | 138.32 μs |
| List of 256 strings | 1.00x | 0.91x | 310.31 μs |
| Complex object | 1.00x | 0.99x | 1522.59 μs |
| Dict with 256 lists of 256 dicts with 1 int | 1.00x | 1.07x | 23563.12 μs |

| encode | json (setuptools) | jsonyx (main) | reference time |
|---|---|---|---|
| List of 256 ASCII strings | 1.00x | 0.47x | 66.49 μs |
| List of 256 dicts with 1 int | 1.00x | 0.94x | 94.91 μs |
| Medium complex object | 1.00x | 0.91x | 146.82 μs |
| List of 256 strings | 1.00x | 2.76x | 323.10 μs |
| Complex object | 1.00x | 1.26x | 1523.92 μs |
| Dict with 256 lists of 256 dicts with 1 int | 1.00x | 0.92x | 22958.90 μs |

@methane (Member, Author) commented Apr 30, 2025 via email

@nineteendo (Contributor)
Better, but it's still twice as slow:

| encode | json (setuptools) | jsonyx | reference time |
|---|---|---|---|
| List of 256 ASCII strings | 1.00x | 0.60x | 50.39 μs |
| List of 256 dicts with 1 int | 1.00x | 0.92x | 91.32 μs |
| Medium complex object | 1.00x | 0.87x | 144.80 μs |
| List of 256 strings | 1.00x | 2.05x | 305.92 μs |
| Complex object | 1.00x | 1.15x | 1543.54 μs |
| Dict with 256 lists of 256 dicts with 1 int | 1.00x | 0.91x | 23013.43 μs |

@nineteendo (Contributor) commented Apr 30, 2025

How about just writing strings without escapes directly to the unicode writer? The main performance improvement of this PR comes from avoiding the creation of a new string.

```c
_PyUnicodeWriter_WriteChar(writer, '"');
_PyUnicodeWriter_WriteStr(writer, pystr);  /* original string */
_PyUnicodeWriter_WriteChar(writer, '"');
```
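The fast path described here can be sketched in plain Python (a behavioral sketch only, not the C implementation; `needs_escape` and `write_string` are made-up names, and the check assumes ensure_ascii=False, so only control characters, `"` and `\` matter):

```python
import json

def needs_escape(s: str) -> bool:
    # A character forces escaping if it is a quote, a backslash,
    # or a control character below U+0020.
    return any(c == '"' or c == '\\' or c < ' ' for c in s)

def write_string(parts: list[str], s: str) -> None:
    if not needs_escape(s):
        # Fast path: no new string is built; the original is
        # written between the quotes as-is.
        parts.append('"')
        parts.append(s)
        parts.append('"')
    else:
        # Slow path: fall back to the stock escaping encoder.
        parts.append(json.dumps(s))

parts: list[str] = []
write_string(parts, "plain")
write_string(parts, 'quote:"')
print("".join(parts))  # -> "plain""quote:\""
```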

@nineteendo (Contributor) commented Apr 30, 2025

Results of that (nineteendo/jsonyx@7c31ee4):

| encode | json (setuptools) | jsonyx (main) | reference time |
|---|---|---|---|
| List of 256 ASCII strings | 1.00x | 0.45x | 50.34 μs |
| List of 256 dicts with 1 int | 1.00x | 0.86x | 91.83 μs |
| Medium complex object | 1.00x | 0.86x | 141.84 μs |
| List of 256 strings | 1.00x | 0.97x | 313.86 μs |
| Complex object | 1.00x | 1.03x | 1529.10 μs |
| Dict with 256 lists of 256 dicts with 1 int | 1.00x | 0.86x | 23190.66 μs |

It's going to be a little harder to apply the change here (unless we just duplicate the functions).

@nineteendo (Contributor)
I would still like a proper fix for faster-cpython/ideas#726, though. Should we just switch back to the private API?

@methane force-pushed the optimize-json-encode branch from 646f257 to 5c8fcf9 on May 1, 2025 03:57
@methane force-pushed the optimize-json-encode branch from 5c8fcf9 to 8e5e00b on May 1, 2025 06:17
@nineteendo (Contributor)
See #133239 for my approach.

@methane force-pushed the optimize-json-encode branch from b66863d to 19c0f1f on May 1, 2025 08:01
@methane (Member, Author) commented May 1, 2025

https://gist.github.com/methane/e080ec9783db2a313f40a2b9e1837e72

| Benchmark | main | patched2 |
|---|---|---|
| json_dumps: List of 256 booleans | 16.6 us | 16.5 us: 1.01x faster |
| json_dumps: List of 256 ASCII strings | 67.9 us | 34.7 us: 1.96x faster |
| json_dumps: List of 256 dicts with 1 int | 122 us | 101 us: 1.21x faster |
| json_dumps: Medium complex object | 205 us | 173 us: 1.18x faster |
| json_dumps: List of 256 strings | 330 us | 302 us: 1.09x faster |
| json_dumps: Complex object | 2.57 ms | 1.96 ms: 1.31x faster |
| json_dumps: Dict with 256 lists of 256 dicts with 1 int | 30.5 ms | 26.5 ms: 1.15x faster |
| json_dumps(ensure_ascii=False): List of 256 booleans | 16.6 us | 16.5 us: 1.01x faster |
| json_dumps(ensure_ascii=False): List of 256 ASCII strings | 68.1 us | 34.6 us: 1.96x faster |
| json_dumps(ensure_ascii=False): List of 256 dicts with 1 int | 122 us | 101 us: 1.21x faster |
| json_dumps(ensure_ascii=False): Medium complex object | 205 us | 172 us: 1.19x faster |
| json_dumps(ensure_ascii=False): List of 256 strings | 329 us | 303 us: 1.09x faster |
| json_dumps(ensure_ascii=False): Complex object | 2.56 ms | 1.95 ms: 1.31x faster |
| json_dumps(ensure_ascii=False): Dict with 256 lists of 256 dicts with 1 int | 30.6 ms | 26.5 ms: 1.15x faster |
| json_loads: List of 256 booleans | 9.01 us | 9.09 us: 1.01x slower |
| json_loads: List of 256 ASCII strings | 40.7 us | 40.2 us: 1.01x faster |
| json_loads: List of 256 floats | 91.4 us | 88.3 us: 1.03x faster |
| json_loads: Medium complex object | 150 us | 147 us: 1.02x faster |
| json_loads: List of 256 strings | 848 us | 816 us: 1.04x faster |
| json_loads: Dict with 256 lists of 256 dicts with 1 int | 46.5 ms | 46.7 ms: 1.00x slower |
| json_loads: List of 256 strings ensure_ascii=False | 85.2 us | 85.7 us: 1.01x slower |
| Geometric mean | (ref) | 1.13x faster |

Benchmark hidden because not significant (5): json_dumps: List of 256 floats, json_dumps(ensure_ascii=False): List of 256 floats, json_loads: List of 256 dicts with 1 int, json_loads: Complex object, json_loads: Complex object ensure_ascii=False

@methane (Member, Author) commented May 1, 2025

This PR is faster, but #133239 is enough to fix the regression from Python 3.13.

For the longer term, the encoder should use a private (maybe UTF-8) buffer instead of PyUnicodeWriter. The calling overhead of PyUnicodeWriter is not negligible: it is low enough for "much faster than pure Python", but not low enough for a JSON serializer.
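The buffered-writer idea can be sketched in plain Python (a sketch under assumptions, not the actual C design; `BufferedWriter` and its size threshold are hypothetical): accumulate small fragments locally and pay the expensive per-call overhead only once per flush.

```python
class BufferedWriter:
    """Accumulate small pieces and flush to the sink in large chunks."""

    def __init__(self, sink, size=256):
        self.sink = sink      # stand-in for the expensive writer call
        self.size = size      # flush threshold (hypothetical value)
        self.buf = []
        self.buf_len = 0

    def write(self, s: str) -> None:
        self.buf.append(s)
        self.buf_len += len(s)
        if self.buf_len >= self.size:
            self.flush()

    def flush(self) -> None:
        if self.buf:
            # One sink call for many small writes.
            self.sink("".join(self.buf))
            self.buf.clear()
            self.buf_len = 0

chunks: list[str] = []
w = BufferedWriter(chunks.append, size=8)
for piece in ['"a"', ",", '"b"', ",", '"c"']:
    w.write(piece)
w.flush()
print(chunks)  # -> ['"a","b",', '"c"']
```

Five `write` calls here result in only two sink calls; the C version would batch bytes the same way before touching PyUnicodeWriter.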

@nineteendo (Contributor)
> This PR is faster, but #133239 is enough to fix the regression from Python 3.13.

It's still not fully fixed: encoding booleans is twice as slow. And I don't fully understand why this PR is faster.

@mdboom (Contributor) commented May 1, 2025

Just as a data point: on our Faster CPython infrastructure, this makes the json_dumps benchmark 14.8% faster than main, and its performance is within the noise of 3.13.0.

I will also kick off a run on #133239 for comparison.

@methane requested a review from vstinner on May 2, 2025 06:03
@methane (Member, Author) commented May 2, 2025

Using _PyUnicodeWriter_WriteASCIIString() instead of PyUnicodeWriter_WriteUTF8():

```
$ ./python -m pyperf compare_to with-fast-path.json use_write_ascii.json -G
Slower (3):
- json_dumps(ensure_ascii=False): List of 256 dicts with 1 int: 101 us +- 0 us -> 102 us +- 0 us: 1.00x slower
- json_loads: Dict with 256 lists of 256 dicts with 1 int: 46.6 ms +- 0.1 ms -> 46.8 ms +- 0.5 ms: 1.00x slower
- json_dumps(ensure_ascii=False): List of 256 floats: 239 us +- 1 us -> 239 us +- 1 us: 1.00x slower

Faster (10):
- json_dumps(ensure_ascii=False): List of 256 strings: 303 us +- 5 us -> 279 us +- 3 us: 1.08x faster
- json_dumps: List of 256 strings: 302 us +- 3 us -> 278 us +- 3 us: 1.08x faster
- json_dumps(ensure_ascii=False): List of 256 booleans: 16.5 us +- 0.1 us -> 15.3 us +- 0.1 us: 1.08x faster
- json_dumps: List of 256 booleans: 16.5 us +- 0.1 us -> 15.3 us +- 0.1 us: 1.07x faster
- json_dumps: Complex object: 1.96 ms +- 0.01 ms -> 1.87 ms +- 0.01 ms: 1.05x faster
- json_dumps(ensure_ascii=False): Complex object: 1.96 ms +- 0.01 ms -> 1.87 ms +- 0.02 ms: 1.05x faster
- json_dumps: Medium complex object: 173 us +- 1 us -> 171 us +- 1 us: 1.01x faster
- json_dumps(ensure_ascii=False): Medium complex object: 172 us +- 1 us -> 171 us +- 1 us: 1.01x faster
- json_loads: Medium complex object: 148 us +- 1 us -> 147 us +- 1 us: 1.00x faster
- json_dumps: List of 256 floats: 239 us +- 0 us -> 239 us +- 0 us: 1.00x faster

Benchmark hidden because not significant (13): json_dumps: List of 256 ASCII strings, json_dumps: List of 256 dicts with 1 int, json_dumps: Dict with 256 lists of 256 dicts with 1 int, json_dumps(ensure_ascii=False): List of 256 ASCII strings, json_dumps(ensure_ascii=False): Dict with 256 lists of 256 dicts with 1 int, json_loads: List of 256 booleans, json_loads: List of 256 ASCII strings, json_loads: List of 256 floats, json_loads: List of 256 dicts with 1 int, json_loads: List of 256 strings, json_loads: Complex object, json_loads: List of 256 strings ensure_ascii=False, json_loads: Complex object ensure_ascii=False
```

Patch:

```diff
diff --git a/Modules/_json.c b/Modules/_json.c
index cd08fa688d3..cd57760282a 100644
--- a/Modules/_json.c
+++ b/Modules/_json.c
@@ -351,7 +351,7 @@ write_escaped_ascii(PyUnicodeWriter *writer, PyObject *pystr)
         }

         if (buf_len + 12 > ESCAPE_BUF_SIZE) {
-            ret = PyUnicodeWriter_WriteUTF8(writer, buf, buf_len);
+            ret = _PyUnicodeWriter_WriteASCIIString((_PyUnicodeWriter*)writer, buf, buf_len);
             if (ret) return ret;
             buf_len = 0;
         }
@@ -359,7 +359,7 @@ write_escaped_ascii(PyUnicodeWriter *writer, PyObject *pystr)

     assert(buf_len < ESCAPE_BUF_SIZE);
     buf[buf_len++] = '"';
-    return PyUnicodeWriter_WriteUTF8(writer, buf, buf_len);
+    return _PyUnicodeWriter_WriteASCIIString((_PyUnicodeWriter*)writer, buf, buf_len);
 }

 static int
@@ -1612,13 +1612,13 @@ encoder_listencode_obj(PyEncoderObject *s, PyUnicodeWriter *writer,
     int rv;

     if (obj == Py_None) {
-      return PyUnicodeWriter_WriteUTF8(writer, "null", 4);
+      return _PyUnicodeWriter_WriteASCIIString((_PyUnicodeWriter*)writer, "null", 4);
     }
     else if (obj == Py_True) {
-      return PyUnicodeWriter_WriteUTF8(writer, "true", 4);
+      return _PyUnicodeWriter_WriteASCIIString((_PyUnicodeWriter*)writer, "true", 4);
     }
     else if (obj == Py_False) {
-      return PyUnicodeWriter_WriteUTF8(writer, "false", 5);
+      return _PyUnicodeWriter_WriteASCIIString((_PyUnicodeWriter*)writer, "false", 5);
     }
     else if (PyUnicode_Check(obj)) {
         return encoder_write_string(s, writer, obj);
@@ -1779,7 +1779,7 @@ encoder_listencode_dict(PyEncoderObject *s, PyUnicodeWriter *writer,

     if (PyDict_GET_SIZE(dct) == 0) {
         /* Fast path */
-        return PyUnicodeWriter_WriteUTF8(writer, "{}", 2);
+        return _PyUnicodeWriter_WriteASCIIString((_PyUnicodeWriter*)writer, "{}", 2);
     }

     if (s->markers != Py_None) {
@@ -1883,7 +1883,7 @@ encoder_listencode_list(PyEncoderObject *s, PyUnicodeWriter *writer,
         return -1;
     if (PySequence_Fast_GET_SIZE(s_fast) == 0) {
         Py_DECREF(s_fast);
-        return PyUnicodeWriter_WriteUTF8(writer, "[]", 2);
+        return _PyUnicodeWriter_WriteASCIIString((_PyUnicodeWriter*)writer, "[]", 2);
     }

     if (s->markers != Py_None) {
```
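One reason the substitution is safe, as I understand it: with ensure_ascii=True (the default), everything the escaping path emits is plain ASCII, so a write entry point that skips UTF-8 decoding and validation loses nothing. A quick Python illustration:

```python
import json

s = "héllo \u2603"       # non-ASCII input (é and a snowman)
out = json.dumps(s)      # ensure_ascii=True is the default
print(out)               # -> "h\u00e9llo \u2603"

# The escaped output is pure ASCII, so the buffer written by
# write_escaped_ascii never needs UTF-8 handling.
assert out.isascii()
```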

Labels: awaiting core review, extension-modules (C modules in the Modules dir), performance (Performance or resource usage), skip issue
3 participants