bpo-42536: GC track recycled tuples #23623

Merged
merged 23 commits into from
Dec 5, 2020

Conversation

brandtbucher
Member

@brandtbucher brandtbucher commented Dec 3, 2020

This fixes possibly untracked reference cycles in:

  • collections.OrderedDict.items
  • dict.items
  • enumerate
  • functools.reduce
  • itertools.combinations
  • itertools.combinations_with_replacement
  • itertools.permutations
  • itertools.product
  • itertools.zip_longest
  • zip
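
A minimal sketch of the kind of cycle at issue, assuming a CPython build that includes this fix. `Marker` and `cycle_is_collected` are hypothetical names used only for illustration. The sketch routes a reference from a yielded list back to the `zip` object through the recycled result tuple, then uses a weak reference to check that a full collection reclaims the whole cycle.

```python
import gc
import weakref


class Marker:
    """Hypothetical helper: a weak-referenceable object used to observe collection."""


def cycle_is_collected():
    marker = Marker()
    ref = weakref.ref(marker)
    it = zip([[marker]])
    # A collection at this point may untrack zip's internal result tuple,
    # which so far holds only None.
    gc.collect()
    # next() recycles that tuple; appending the iterator to the yielded list
    # builds the cycle: it -> result tuple -> list -> it.
    next(it)[0].append(it)
    del it, marker
    gc.collect()
    # True if the collector could follow the result tuple and free the cycle.
    return ref() is None


print(cycle_is_collected())
```

On an interpreter without the fix, the recycled tuple can stay untracked, so the collector may not be able to follow it and the function would be expected to report that the cycle survived.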

https://bugs.python.org/issue42536

@brandtbucher brandtbucher requested a review from vstinner December 3, 2020 16:58
Member

@pablogsal pablogsal left a comment

Thanks @brandtbucher for giving this a go. I think there is a better approach here: force the result to be tracked in the visiting function. This has the advantage of adding overhead only when the GC calls the visiting function on the zip object, as opposed to paying on every call to next(). To be clear, this is what I mean:

diff --git a/Python/bltinmodule.c b/Python/bltinmodule.c
index a73b8cb320..b05239c037 100644
--- a/Python/bltinmodule.c
+++ b/Python/bltinmodule.c
@@ -2604,6 +2604,9 @@ static int
 zip_traverse(zipobject *lz, visitproc visit, void *arg)
 {
     Py_VISIT(lz->ittuple);
+    if (!_PyObject_GC_IS_TRACKED(lz->result)) {
+        _PyObject_GC_TRACK(lz->result);
+    }
     Py_VISIT(lz->result);
     return 0;
 }
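
To gauge whether the per-`next()` cost matters in practice, one could time a tight `zip` loop on builds with and without the change. The following is only a rough sketch: `time_zip_loop` is a hypothetical helper, and the input size and repeat count are arbitrary assumptions.

```python
import timeit


def time_zip_loop(n=100_000, repeat=5):
    """Return the best wall-clock time (seconds) for one pass over zip(a, b)."""
    a = list(range(n))
    b = list(range(n))

    def loop():
        # Unpacking drops each result tuple right away, so zip can take its
        # tuple-recycling path on the following call to next().
        for _x, _y in zip(a, b):
            pass

    return min(timeit.repeat(loop, number=1, repeat=repeat))


print(f"best of 5: {time_zip_loop():.6f} s")
```

Binding the whole tuple to a single loop variable would keep a second reference alive across iterations and bypass the recycling path, which is why the sketch unpacks instead.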

@bedevere-bot

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@pablogsal pablogsal dismissed their stale review December 3, 2020 20:12

Actually, I am not so sure that this is the better way. The current approach is formally correct, since re-tracking should happen when the tuple is mutated (as we do in other containers).

@pablogsal
Member

The only reason I think we should consider doing this on the visiting function is due to these results:

https://bugs.python.org/msg382365

@brandtbucher
Member Author

Thanks Pablo. See my thoughts here:

https://bugs.python.org/msg382455

If that doesn't worry you at all, I'm fine moving this to the traverse function.

@brandtbucher
Member Author

brandtbucher commented Dec 3, 2020

Oh, wait. I see now that GitHub says you "dismissed" your "stale review". I'm not sure what that means... are we in agreement here to keep this as-is?

@pablogsal
Member

pablogsal commented Dec 3, 2020

> Oh, wait. I see now that GitHub says you "dismissed" your "stale review". I'm not sure what that means... are we in agreement here to keep this as-is?

Sorry, I commented on it here: https://bugs.python.org/msg382458. Unless we see a horrendous performance regression, I prefer this approach to the one I proposed using tp_traverse.

Member

@vstinner vstinner left a comment

LGTM, but I left some minor coding style remarks (you're free to ignore them).

I also suggest copying your NEWS entry into the commit message.

Member

@pablogsal pablogsal left a comment

🚀

Member

@vstinner vstinner left a comment

LGTM. Thanks for the latest round of updates ;-)

@brandtbucher brandtbucher merged commit 226a012 into python:master Dec 5, 2020
@miss-islington
Contributor

Thanks @brandtbucher for the PR 🌮🎉.. I'm working now to backport this PR to: 3.8, 3.9.
🐍🍒⛏🤖

@miss-islington
Contributor

Sorry, @brandtbucher, I could not cleanly backport this to 3.9 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker 226a012d1cd61f42ecd3056c554922f359a1a35d 3.9

@miss-islington
Contributor

Sorry @brandtbucher, I had trouble checking out the 3.8 backport branch.
Please backport using cherry_picker on command line.
cherry_picker 226a012d1cd61f42ecd3056c554922f359a1a35d 3.8

brandtbucher added a commit to brandtbucher/cpython that referenced this pull request Dec 5, 2020
Several built-in and standard library types now ensure that their internal result tuples are always tracked by the garbage collector:

- collections.OrderedDict.items
- dict.items
- enumerate
- functools.reduce
- itertools.combinations
- itertools.combinations_with_replacement
- itertools.permutations
- itertools.product
- itertools.zip_longest
- zip

Previously, they could have become untracked by a prior garbage collection.
(cherry picked from commit 226a012)
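
The guarantee described in this entry can be spot-checked from Python with `gc.is_tracked`. This is a minimal sketch, assuming an interpreter that includes the change; `recycled_results_tracked` is a hypothetical helper, and `functools.reduce` is left out because its internal tuple is not returned to the caller.

```python
import gc
from collections import OrderedDict
from itertools import (combinations, combinations_with_replacement,
                       permutations, product, zip_longest)


def recycled_results_tracked():
    """Map each iterator name to whether its next result tuple is GC-tracked."""
    # These iterators build their result tuple on the first next() call, so
    # prime them with an item that holds only None.
    primed = {
        "combinations": combinations([None, []], 1),
        "combinations_with_replacement":
            combinations_with_replacement([None, []], 1),
        "permutations": permutations([None, []], 1),
        "product": product([None, []]),
    }
    for it in primed.values():
        next(it)
    # These start out with an internal result tuple filled with None.
    fresh = {
        "zip": zip([[]]),
        "zip_longest": zip_longest([[]]),
        "enumerate": enumerate([[]]),
        "dict.items": iter({None: []}.items()),
        "OrderedDict.items": iter(OrderedDict({None: []}).items()),
    }
    # A collection here may untrack the internal tuples, since none of them
    # currently holds a container.
    gc.collect()
    # The next result holds a list, so it has to be tracked again.
    return {name: gc.is_tracked(next(it))
            for name, it in {**primed, **fresh}.items()}


for name, tracked in recycled_results_tracked().items():
    print(f"{name}: {tracked}")
```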
@bedevere-bot

GH-23651 is a backport of this pull request to the 3.9 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.9 only security fixes label Dec 5, 2020
@bedevere-bot

GH-23652 is a backport of this pull request to the 3.8 branch.

pablogsal pushed a commit that referenced this pull request Dec 7, 2020
Several built-in and standard library types now ensure that their internal result tuples are always tracked by the garbage collector:

- collections.OrderedDict.items
- dict.items
- enumerate
- functools.reduce
- itertools.combinations
- itertools.combinations_with_replacement
- itertools.permutations
- itertools.product
- itertools.zip_longest
- zip

Previously, they could have become untracked by a prior garbage collection.
(cherry picked from commit 226a012)
adorilson pushed a commit to adorilson/cpython that referenced this pull request Mar 13, 2021
Several built-in and standard library types now ensure that their internal result tuples are always tracked by the garbage collector:

- collections.OrderedDict.items
- dict.items
- enumerate
- functools.reduce
- itertools.combinations
- itertools.combinations_with_replacement
- itertools.permutations
- itertools.product
- itertools.zip_longest
- zip

Previously, they could have become untracked by a prior garbage collection.
@brandtbucher brandtbucher deleted the untracked-zip-result branch July 21, 2022 20:20
Labels
type-bug An unexpected behavior, bug, or error