Skip to content

[3.12] gh-106931: Intern Statically Allocated Strings Globally (gh-107272) #107358

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

miss-islington
Copy link
Contributor

@miss-islington miss-islington commented Jul 27, 2023

We tried this before with a dict and for all interned strings. That ran into problems due to interpreter isolation. However, exclusively using a per-interpreter cache caused some inconsistency that can eliminate the benefit of interning. Here we circle back to using a global cache, but only for statically allocated strings. We also use a more-basic _Py_hashtable_t for that global cache instead of a dict.

Ideally we would only have the global cache, but the optional isolation of each interpreter's allocator means that a non-static string object must not outlive its interpreter. Thus we would have to store a copy of each such interned string in the global cache, tied to the main interpreter.
(cherry picked from commit b72947a)

Co-authored-by: Eric Snow ericsnowcurrently@gmail.com

…gh-107272)

We tried this before with a dict and for all interned strings.  That ran into problems due to interpreter isolation.  However, exclusively using a per-interpreter cache caused some inconsistency that can eliminate the benefit of interning.  Here we circle back to using a global cache, but only for statically allocated strings.  We also use a more-basic _Py_hashtable_t for that global cache instead of a dict.

Ideally we would only have the global cache, but the optional isolation of each interpreter's allocator means that a non-static string object must not outlive its interpreter.  Thus we would have to store a copy of each such interned string in the global cache, tied to the main interpreter.
(cherry picked from commit b72947a)

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
@ericsnowcurrently
Copy link
Member

As noted elsewhere, I'm going to mull over whether or not this is worth it in 3.12.

CC @Yhg1s

@ericsnowcurrently
Copy link
Member

ericsnowcurrently commented Jul 27, 2023

This also needs gh-107362.

@ericsnowcurrently
Copy link
Member

Having slept on it, I don't see this change as important enough to be worth backporting, particularly given the release manager's reservations.

@miss-islington miss-islington deleted the backport-b72947a-3.12 branch July 28, 2023 16:42
@ericsnowcurrently
Copy link
Member

This may be needed to fix gh-110279.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants