-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
gh-132983: Convert dict_content to take Py_buffer #133924
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
self->dict_buffer = PyMem_RawMalloc(dict_content->len); | ||
if (!self->dict_buffer) { | ||
return PyErr_NoMemory(); | ||
} | ||
memcpy(self->dict_buffer, dict_content->buf, dict_content->len); | ||
self->dict_len = dict_content->len; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @vstinner, I think RawMalloc
is correct here but would be grateful for a review / confirmation please
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like you told the GIL in this code, so you can use PyMem_Malloc() instead. The Raw variant is when you don't hold the GIL.
@@ -60,24 +59,28 @@ _zstd_ZstdDict_new_impl(PyTypeObject *type, PyObject *dict_content, | |||
} | |||
|
|||
/* Check dict_content's type */ | |||
self->dict_content = PyBytes_FromObject(dict_content); | |||
if (self->dict_content == NULL) { | |||
if (dict_content == NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is never NULL.
BTW, overriding arbitrary exception with a TypeError in the current code is wrong.
if (Py_SIZE(self->dict_content) < 8) { | ||
self->dict_buffer = PyMem_RawMalloc(dict_content->len); | ||
if (!self->dict_buffer) { | ||
return PyErr_NoMemory(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self
is leaked here.
@@ -124,23 +124,33 @@ PyDoc_STRVAR(ZstdDict_dictid_doc, | |||
"The special value '0' means a 'raw content' dictionary," | |||
"without any restrictions on format or content."); | |||
|
|||
PyDoc_STRVAR(ZstdDict_dictcontent_doc, | |||
"The content of a Zstandard dictionary, as a bytes object."); | |||
|
|||
static PyObject * | |||
ZstdDict_str(PyObject *ob) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is wrong to implement __str__
. __repr__
should be implemented.
static PyObject * | ||
ZstdDict_str(PyObject *ob) | ||
{ | ||
ZstdDict *dict = ZstdDict_CAST(ob); | ||
return PyUnicode_FromFormat("<ZstdDict dict_id=%u dict_size=%zd>", | ||
dict->dict_id, Py_SIZE(dict->dict_content)); | ||
dict->dict_id, dict->dict_len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dict->dict_id, dict->dict_len); | |
(unsigned int)dict->dict_id, dict->dict_len); |
self->dict_len = dict_content->len; | ||
|
||
/* Both ordinary and "raw content" dictionaries must be 8 bytes minimum */ | ||
if (self->dict_len < 8) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be checked at the beginning, before any allocations.
} | ||
|
||
self->dict_content = NULL; | ||
self->d_dict = NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self->d_dict = NULL; | |
self->d_dict = NULL; | |
self->dict_buffer = NULL; |
Otherwise self->dict_buffer
is not initialized in the destructor.
char *dict_buffer = self->dict_buffer; | ||
Py_ssize_t dict_len = self->dict_len; | ||
Py_BEGIN_ALLOW_THREADS | ||
self->d_dict = ZSTD_createDDict(dict_buffer, | ||
dict_len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just inline them, as in the code below.
char *dict_buffer = self->dict_buffer; | |
Py_ssize_t dict_len = self->dict_len; | |
Py_BEGIN_ALLOW_THREADS | |
self->d_dict = ZSTD_createDDict(dict_buffer, | |
dict_len); | |
Py_BEGIN_ALLOW_THREADS | |
self->d_dict = ZSTD_createDDict(self->dict_buffer, self->dict_len); |
char *dict_buffer = self->dict_buffer; | ||
Py_ssize_t dict_len = self->dict_len; | ||
Py_BEGIN_ALLOW_THREADS | ||
cdict = ZSTD_createCDict(dict_buffer, | ||
dict_len, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
char *dict_buffer = self->dict_buffer; | |
Py_ssize_t dict_len = self->dict_len; | |
Py_BEGIN_ALLOW_THREADS | |
cdict = ZSTD_createCDict(dict_buffer, | |
dict_len, | |
Py_BEGIN_ALLOW_THREADS | |
cdict = ZSTD_createCDict(self->dict_buffer, self->dict_len, |
ZstdDict
'sdict_content
is just binary data, we don't need to allocate an entire PyBytesObject for it. This also lets us use thePy_buffer
argument clinic converter.We replace the
dict_content
member ofZstdDict
withdict_buffer
anddict_len
.ZstdDict.dict_content
is still exposed as a bytes object to Python via a new getter.A
cc @Rogdham