-
-
Notifications
You must be signed in to change notification settings - Fork 32k
gh-132983: Convert dict_content to take Py_buffer #133924
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It would probably be good to run the refleaks buildbots on this PR before merge. |
@AA-Turner would you like to pick this up again? I'm happy to take this over if you don't have time. Want to get the last of the zstd items done before b2. |
🤖 New build scheduled with the buildbot fleet by @AA-Turner for commit fe3fb49 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F133924%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again. |
Merged main because there were nonsense errors in the tests. |
90cef43
to
30ac9d5
Compare
🤖 New build scheduled with the buildbot fleet by @AA-Turner for commit 30ac9d5 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F133924%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again. |
Apologies for the force-push, my local git state got messy. |
Modules/_zstd/zstddict.c
Outdated
"Zstandard dictionary content must be longer " | ||
"than eight bytes."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can also be exactly 8 bytes, right?
What happens if you remove this check? What error will you get?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wording error, sorry.
What happens if you remove this check? What error will you get?
From RFC 8878 s.5, a dictionary can either be in Zstandard format or 'raw content' format.
-
"raw content" dictionaries ... must be at least 8 bytes.
- The Zstandard format has a required 4 byte magic number and 4 byte ID
This check only bites when is_raw
is True
, as normally we check that the dict is a valid Zstandard directory via ZSTD_getDictID_fromDict() != 0
. That function returns 0 if dictSize < 8
or otherwise if the data does not match the Zstandard header.
The Zstandard documentation is quite sparse on 'raw content' dictionaries, but I would conclude that the early error here probably doesn't hurt, and helps ensure we are conformant to the RFC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no principal difference between keeping binary data as a bytes object or a raw memory buffer. But if the latter looks better to you, LGTM.
Thanks @AA-Turner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.14. |
…()`` (pythonGH-133924) (cherry picked from commit f2ce4bb) Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
GH-134723 is a backport of this pull request to the 3.14 branch. |
ZstdDict
'sdict_content
is just binary data, we don't need to allocate an entire PyBytesObject for it. This also lets us use thePy_buffer
argument clinic converter.We replace the
dict_content
member ofZstdDict
withdict_buffer
anddict_len
.ZstdDict.dict_content
is still exposed as a bytes object to Python via a new getter.A
cc @Rogdham