-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
gh-123660: Internal macros for accessing empty string/bytes singletons #123643
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@vstinner @encukou @erlend-aasland @zooba @serhiy-storchaka In our discussion on the |
The C API WG deals with public API. IMO the internal API is fine as it is; the rename doesn't bring much. |
I'm quite happy to review internal C API designs as well - we all still have to live with them. As a concept, this one looks fine to me. It'll help our contributors discover how best to access these singletons, as it isn't exactly obvious. If/when we make another way to get them it'll be easier to update. As a PR, I think there are some generated files you've touched that will be undone when they regenerate (though I haven't checked for sure). Make sure you do a |
I think there's already a CI job that checks whether the files are up-to-date or not and I locally did regenerate them so they should be part of the PR. It should be fine I think. I'll create an issue soon (I think we can make a meta-issue to integrate the empty tuple case as well). |
For now, I'm converting it into a draft until I've resolved my own interrogations on whether to replace |
_Py_STR() and _Py_SINGLETON() are replaced with direct pointers at build time, whereas PyUnicode_FromString("") and PyUnicode_New(0, 0) are implemented as regular function calls at runtime: it's slower. |
Yes, but is it safe to replace all occurrences? Actually, what I'd like to know whether there was a deeper meaning of not using the singleton (is it because when the code was written, this singleton did not yet exist? I think it was introduced in 2022). |
This ensures that the `_Py_EMPTY_STRING` keeps working.
I don't think that it's useful to replace all occurrences. |
Which is why I didn't do it and was interrogating myself on whether it's useful or not. Now that I have another opinion, I'll leave it as is. I don't want to have an immense diff so I kept it simple. Otherwise the number of occurrences to change is much larger and we'll have a lot of changed files (thereby creating conflicts in other PRs I think...) |
Are there some you know are not safe? Provided the reference counting works out (which may require additional changes), it should be fine to replace calls like |
I was extremely brief there, sorry. In the WG discussion, Marc-Andre asks for public API. And I think that is what we need first. Perhaps something like:
These should be documented. They're ”borrowed from the interpreter”, i.e they don't necessarily survive interpreter shutdown/reinitialization, and can't be shared between interpreters, although both will work in CPython. And when that's in place, these can get a faster implementation for non-limited API or for I don't think it's helpful to only add new undocumented internal macros. |
I didn't propose
It's strange to me to add a new macro relying on something only provided "for backward compatibility". If we add such macro, I would prefer to see the function documentation rephrased. Or maybe it's a bad idea to add a new macro which returns a borrowed reference, whereas the C API Guidelines is against adding new functions returning borrowed references. I don't know. |
I have mixed feelings about this change. I'm not convinced that it makes Python easier to maintain or easier to understand. I like using explicitly the complex _Py_STR() and _Py_SINGLETON() macros, but maybe it's just a personal taste :-) |
I agree with that.
Which guidelines? I guess we should clarify those first :) |
I suspect this is personal taste, because you are highly familiar with how it works under the covers :-) For someone who isn't so aware, I think they're unlikely to guess to write the more complex version, and so if we have an obvious macros that does it then they're more likely to choose something efficient. |
I'm closing this one since we didn't manage to find an answer to what we should replace at the end and whether this macro would be useful or not. We can revisit this question later (and there are anyway lots of conflicts now and I'm cleaning up my stale branches). |
This is a proposal for using macros instead of typing everytime
_Py_STR(empty)
or_Py_SINGLETON(bytes_empty)
This is a follow-up to capi-workgroup/decisions#36 (comment).
I wanted to use
Py_GetConstantBorrowed
instead, but I wasn't sure which approach was the most efficient one. InPy_GetConstantBorrowed
you have aconstant_id < Py_ARRAY_LENGTH(constants)
check and I'm not sure it's even possible to use the function since you'll need to create the constant in the arrayconstants
, and this usesPyUnicode_New(0, 0)
which itself returns the singleton. So you'll end up with a recursion issue.Anyway, it's still WIP although all occurrences to empty strings and bytes were replaced. I'm just opening a draft PR so it's easier to see the diff (I'll create an issue later).