-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
bpo-1635741: port _elementtree to multi-phase init (PEP 489) #23535
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@vstinner @shihai1991 @corona10 please review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The patch looks like it was a lot of work. Thanks for investing your time to port _elementtree
.
I'm a bit worried that the module state getter is going to affect performance negatively. The code has to use the slow _PyType_GetModuleByDef()
API. In my patch for the _ssl
module I added a module state pointer to instance structs. For example you could add state
to typedef struct ElementObject
and then reference the state with self->state
. This would reduce the amount of get state calls and simplify the code.
What do you think? Is this a safe apprach?
#define ET_STATE_GLOBAL \ | ||
((elementtreestate *) PyModule_GetState(PyState_FindModule(&elementtreemodule))) | ||
static inline elementtreestate * | ||
get_elementtree_module_state_by_object(void *object) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The qualifier object
is not self-explanatory. Every Python object is an object. This includes module objects and type objects. How about:
get_elementtree_module_state_by_object(void *object) | |
get_elementtree_module_state_by_instance(PyObject *self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about expend get_element_module_state()
to receive self
object or type
object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tiran I tried using PyObject *
and that gives a lot of compiler warnings (see somem below):
C:\Dev\OSS\cpython\cpython\Modules_elementtree.c(3749,74): warning C4133: 'function': incompatible types - from 'XMLParserObject *' to 'PyObject *' [C:\Dev\OSS\cpython\cpython\PCbuild_elementtree.vcxproj]
C:\Dev\OSS\cpython\cpython\Modules\clinic_elementtree.c.h(26,52): warning C4133: 'function': incompatible types - from 'ElementObject *' to 'PyObject *' [C:\Dev\OSS\cpython\cpython\PCbuild_elementtree.vcxproj]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shihai1991 do you mean two methods, one for each?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. just my personal suggestion :)
class _elementtree.Element "ElementObject *" "get_elementtree_module_state_by_object(self)->element_type" | ||
class _elementtree.TreeBuilder "TreeBuilderObject *" "get_elementtree_module_state_by_object(self)->tree_builder_type" | ||
class _elementtree.XMLParser "XMLParserObject *" "get_elementtree_module_state_by_object(self)->xml_parser_type" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this correct? AFAIK the argument clinic code has only access to type
, not to self
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tiran This is where I got from trial and error, see the implementation of _elementtree_Element_append in Modules/clinic/_elementtree.c.h
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
@tiran I'm not sure if its safe but I understand it in principle. I don't know exactly the lifetime of instance structs when pickling. @vstinner thoughts on that approach? Is there a benchmark I could try to measure the performance impact? _PyType_GetModuleByDef seems to only be slow if there is significant subclassing so I could try to whip up an example that tries to overdo that. |
This PR introduces a ref. leak (looks similar to the ref. leak I had in PR #23428):
|
@erlend-aasland oof sorry, I didn't realize you were working on this module too. Any tips to find the cause of the ref leak or which reference is leaking? I'm not too familiar with the diagnostics & telemetry available. |
No problem, you're fine! Feel free to take a look at my PR. I haven't been able to devote any free time to this the last month or so, anyway.
Me neither. It kinda stopped when I encountered the ref. leak. I use |
This PR is stale because it has been open for 30 days with no activity. |
https://bugs.python.org/issue1635741 is closed. What is that status of this PR? |
The change is still relevant, but should use a new issue number. Moreover, the SC asked to put the conversion of static types to heap types on hold. @encukou and @erlend-aasland wrote https://peps.python.org/pep-0687/ which may unblock the situation but it's still a draft. |
Superseded by gh-101285 |
https://bugs.python.org/issue1635741