-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
Difference between pickle.py and _pickle for certain strings #113028
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I'd happily provide a PR. It seems only necessary to introduce a temporary variable so the original |
I'll be glad to review your PR. |
jeff5
added a commit
to jeff5/cpython
that referenced
this issue
Dec 23, 2023
This fixes a divergence between the Python and C implementations of pickle for protocol 0, such that it pickle.py fails to re-use the first pickled representation of strings involving characters that have to be escaped.
miss-islington
pushed a commit
to miss-islington/cpython
that referenced
this issue
Dec 24, 2023
…ythonGH-113436) This fixes a divergence between the Python and C implementations of pickle for protocol 0, such that it pickle.py fails to re-use the first pickled representation of strings involving characters that have to be escaped. (cherry picked from commit 0839863) Co-authored-by: Jeff Allen <ja.py@farowl.co.uk>
miss-islington
pushed a commit
to miss-islington/cpython
that referenced
this issue
Dec 24, 2023
…ythonGH-113436) This fixes a divergence between the Python and C implementations of pickle for protocol 0, such that it pickle.py fails to re-use the first pickled representation of strings involving characters that have to be escaped. (cherry picked from commit 0839863) Co-authored-by: Jeff Allen <ja.py@farowl.co.uk>
This was referenced Dec 24, 2023
serhiy-storchaka
pushed a commit
that referenced
this issue
Dec 24, 2023
…H-113436) (GH-113448) This fixes a divergence between the Python and C implementations of pickle for protocol 0, such that it pickle.py fails to re-use the first pickled representation of strings involving characters that have to be escaped. (cherry picked from commit 0839863) Co-authored-by: Jeff Allen <ja.py@farowl.co.uk>
serhiy-storchaka
pushed a commit
that referenced
this issue
Dec 24, 2023
…H-113436) (GH-113449) This fixes a divergence between the Python and C implementations of pickle for protocol 0, such that it pickle.py fails to re-use the first pickled representation of strings involving characters that have to be escaped. (cherry picked from commit 0839863) Co-authored-by: Jeff Allen <ja.py@farowl.co.uk>
ryan-duve
pushed a commit
to ryan-duve/cpython
that referenced
this issue
Dec 26, 2023
…ythonGH-113436) This fixes a divergence between the Python and C implementations of pickle for protocol 0, such that it pickle.py fails to re-use the first pickled representation of strings involving characters that have to be escaped.
kulikjak
pushed a commit
to kulikjak/cpython
that referenced
this issue
Jan 22, 2024
…ythonGH-113436) This fixes a divergence between the Python and C implementations of pickle for protocol 0, such that it pickle.py fails to re-use the first pickled representation of strings involving characters that have to be escaped.
aisk
pushed a commit
to aisk/cpython
that referenced
this issue
Feb 11, 2024
…ythonGH-113436) This fixes a divergence between the Python and C implementations of pickle for protocol 0, such that it pickle.py fails to re-use the first pickled representation of strings involving characters that have to be escaped.
Glyphack
pushed a commit
to Glyphack/cpython
that referenced
this issue
Sep 2, 2024
…ythonGH-113436) This fixes a divergence between the Python and C implementations of pickle for protocol 0, such that it pickle.py fails to re-use the first pickled representation of strings involving characters that have to be escaped.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Bug report
Bug description:
There is a logical error in
pickle.Pickler.save_str
for protocol 0, such that it repeats pickling of a string object each time it is presented. The design clearly intends to re-use the first pickled representation, and the C-implementation_pickle
does that.In an implementation that does not provide a compiled
_pickle
(PyPy may be one) this is inefficient, but not actually wrong. The intended behaviour occurs with a simple string:When read by
loads()
this string says:The bug emerges when the pickled string needs pre-encoding:
Here we see identical data stacked and saved (but not used). The problem is here:
cpython/Lib/pickle.py
Lines 860 to 866 in 42a86df
where the return from
obj.replace
may be a different object fromobj
. In CPython, that is only if a replacement takes place, which is why the problem only appears in the second case above.save_str
is only called when the object has not already been memoized, but in the cases at issue, the string memoized is not the original object, and so when the original string object is presented again,save_str
is called again.Depending upon the detailed behaviour of
str.replace
(in particular, if you decide to return an interned value when the result is, say, a Latin-1 character) an assertion may fail inmemoize()
:cpython/Lib/pickle.py
Lines 504 to 507 in 42a86df
AssertionError
in CPython.This has probably gone unnoticed so long only because
pickle.py
is not tested. (At least, I think it isn't. #105250 and #53350 note this coverage problem.)CPython versions tested on:
3.11
Operating systems tested on:
Windows
Linked PRs
The text was updated successfully, but these errors were encountered: