gh-132762: Fix underallocation bug in dict.fromkeys() and expand test coverage #133627

Merged
merged 5 commits into from
May 8, 2025

Conversation

angela-tarantula
Contributor

@angela-tarantula angela-tarantula commented May 8, 2025

Closes #132762

Summary

dict_set_fromkeys() sized its new table using only the size of the iterable input, ignoring any entries already in the dictionary. When the resulting table was too small to hold those existing entries, reinserting them triggered an infinite loop in dictresize(). This patch adds the same Py_MAX(…, DK_LOG_SIZE(mp->ma_keys)) guard that dict_dict_fromkeys() uses, so the table is never accidentally shrunk below its current capacity. The relevant test case, baddict3, has been updated to cover this edge case.
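
To make the sizing problem concrete, here is a rough pure-Python model of the idea. It is not the CPython implementation: the helper names, the minimum table size of 2**3 slots, and the 2/3 load factor are assumptions chosen for illustration. It only shows why taking the maximum of the estimated size and the current size avoids picking a table too small for the entries already present.

```python
LOG_MINSIZE = 3  # assumption: the smallest table has 2**3 slots


def usable_entries(log2_size: int) -> int:
    """Entries a table of 2**log2_size slots can hold (assumed 2/3 load factor)."""
    return ((1 << log2_size) * 2) // 3


def estimate_log2_size(n_entries: int) -> int:
    """Smallest log2 table size whose usable capacity covers n_entries."""
    log2_size = LOG_MINSIZE
    while usable_entries(log2_size) < n_entries:
        log2_size += 1
    return log2_size


def unguarded_target(current_log2: int, incoming: int) -> int:
    """Model of the old behavior: size the table from the incoming keys only."""
    return estimate_log2_size(incoming)


def guarded_target(current_log2: int, incoming: int) -> int:
    """Model of the fix: never choose a size below the current one."""
    return max(estimate_log2_size(incoming), current_log2)


# A dictionary that already holds 1000 entries receives 3 keys from a set.
current_log2 = estimate_log2_size(1000)
before = unguarded_target(current_log2, 3)
after = guarded_target(current_log2, 3)

print(usable_entries(before) >= 1000)  # False: too small for the existing entries
print(usable_entries(after) >= 1000)   # True: the current capacity is preserved
```

In this model the unguarded target is sized for 3 keys only, which is the situation the summary describes as unable to reinsert the existing entries.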

For more background, see my proposal.

3 New Regression Tests

Ever since the fast path was updated, the slow path completely lost test coverage. To rectify this, I added 3 new tests:

  • 1 slow-path test when the iterable input is neither a set nor a dictionary
  • 1 slow-path test when fromkeys() is called on a proper subclass of dict, baddict4
  • 1 fast-path test when the input is a set (worth testing explicitly now that dict_dict_fromkeys() and dict_set_fromkeys() are separate)
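
As a rough illustration of the three kinds of input listed above, the sketch below calls fromkeys() with a set, with an iterable that is neither a set nor a dictionary, and on a proper subclass of dict. The class name `PlainSubclass` is hypothetical and is not the `baddict4` used in the test suite. Which internal path each call takes is an implementation detail that cannot be observed from Python, so the comments only restate the mapping given in the list.

```python
# Set input on an exact dict (the list above calls this a fast path).
from_set = dict.fromkeys({"a", "b", "c"})

# Input that is neither a set nor a dict (described above as the slow path).
from_iterator = dict.fromkeys(iter("abc"), 0)


class PlainSubclass(dict):
    """Hypothetical proper subclass of dict, used only for illustration."""


# fromkeys() on a proper subclass (also described above as the slow path).
from_subclass = PlainSubclass.fromkeys(["x", "y"], 1)

print(from_set == {"a": None, "b": None, "c": None})  # True
print(from_iterator == {"a": 0, "b": 0, "c": 0})      # True
print(type(from_subclass) is PlainSubclass)           # True
```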

Thanks for the review! @DinoV @colesbury

previously covered:
 - fast path for dictionary inputs
 - fast path when object's constructor returns non-empty dict (too small
   for good coverage)

now additionally covered:
 - fast path for set inputs
 - slow path for non-set, non-dictionary inputs
 - fast path when object's constructor returns *large* non-empty dict
 - slow path when object is a proper subclass of dict
@python-cla-bot

python-cla-bot bot commented May 8, 2025

All commit authors signed the Contributor License Agreement.

CLA signed

@bedevere-app

bedevere-app bot commented May 8, 2025

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@colesbury colesbury self-requested a review May 8, 2025 17:01
@colesbury colesbury added needs backport to 3.13 bugs and security fixes needs backport to 3.14 bugs and security fixes labels May 8, 2025
Contributor

@colesbury colesbury left a comment

Thanks, lgtm!

@colesbury colesbury merged commit 421ba58 into python:main May 8, 2025
47 checks passed
@miss-islington-app

Thanks @angela-tarantula for the PR, and @colesbury for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13, 3.14.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request May 8, 2025
…h-133627)

The function `dict_set_fromkeys()` adds elements of a set to an existing
dictionary. The size of the expanded dictionary was estimated with
`PySet_GET_SIZE(iterable)`, which did not take into account the size of the
existing dictionary.
(cherry picked from commit 421ba58)

Co-authored-by: Angela Liss <59097311+angela-tarantula@users.noreply.github.com>
@miss-islington-app

Sorry, @angela-tarantula and @colesbury, I could not cleanly backport this to 3.13 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 421ba589d02b53131f793889d221ef3b1f1410a4 3.13

@bedevere-app

bedevere-app bot commented May 8, 2025

GH-133685 is a backport of this pull request to the 3.14 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.14 bugs and security fixes label May 8, 2025
colesbury pushed a commit to colesbury/cpython that referenced this pull request May 8, 2025
…ythongh-133627)

@bedevere-app

bedevere-app bot commented May 8, 2025

GH-133686 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label May 8, 2025
colesbury pushed a commit that referenced this pull request May 8, 2025
) (gh-133685)

colesbury added a commit that referenced this pull request May 8, 2025
) (gh-133686)

Development

Successfully merging this pull request may close these issues.

dict_set_fromkeys() calculates size of dictionary improperly