Skip to content

GH-137759:Limit _PyObject_HashFast to dicts keys, rename it, and mark it as Py_ALWAYS_INLINE #137828

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

JasonMendoza2008
Copy link

@JasonMendoza2008 JasonMendoza2008 commented Aug 15, 2025

Context:

setobject.c used a mix of _PyObject_HashFast and PyObject_Hash. The consensus1 was that _PyObject_HashFast should only really be used for dict keys (string-only sets are not so omnipresent). This PR uses the standard PyObject_Hash without the fast path (making set[Any] faster and set[str] slower), renames _PyObject_HashFast since its name was misleading and marked it as Py_ALWAYS_INLINE to force inlining.

Pretty sure this doesn't require a NEWS entry?

Footnotes

  1. There were discussions in the same issue around removing _PyObject_HashFast completely and adding the fast path for strings (and even potentially integers) to PyObject_Hash but there was no consensus so I'm not including it for now.

@python-cla-bot
Copy link

python-cla-bot bot commented Aug 15, 2025

All commit authors signed the Contributor License Agreement.

CLA signed

@bedevere-app

This comment was marked as resolved.

@bedevere-app

This comment was marked as resolved.

@ZeroIntensity
Copy link
Member

making set[Any] faster and set[str] slower

I'm not totally convinced that this is true. Detecting the fast-path for strings is just one pointer comparison, which really won't have a noticeable performance impact. I agree that we want _PyObject_HashFast/_PyObject_HashDictKey for inlining purposes, but I don't see why the fast path has to be limited to it.

@JasonMendoza2008
Copy link
Author

JasonMendoza2008 commented Aug 15, 2025

making set[Any] faster and set[str] slower

I'm not totally convinced that this is true. Detecting the fast-path for strings is just one pointer comparison, which really won't have a noticeable performance impact. I agree that we want _PyObject_HashFast/_PyObject_HashDictKey for inlining purposes, but I don't see why the fast path has to be limited to it.

I'm fine with adding a fast-path for strings and integers (but then do we even need _PyObject_HashFast/_PyObject_HashDictKey , we could just put everything in PyObject_Hash1). I didn't since there was no consensus.

Footnotes

  1. However, we probably can't inline that function since it's bigger? I'm not sure... I'm no expert in what's too big to be inlined.

@ZeroIntensity
Copy link
Member

but then do we even need HashFast, we could just put everything in PyObject_Hash

Yeah, because even if there's no speedup there, it's just simpler maintenance (in the future, we might want to make other assumptions about a dictionary key's hash that improve performance).

@JasonMendoza2008
Copy link
Author

JasonMendoza2008 commented Aug 15, 2025

I'm not totally convinced that this is true. Detecting the fast-path for strings is just one pointer comparison, which really won't have a noticeable performance impact

Playing devil's advocate: if we don't detect it, we're a a function pointer dereference and call away from getting the hash anyway, right? The hash is not recomputed without that fast-path since it's stored (for strings, but for integers it's just an AND operation if I remember correctly)?

@ZeroIntensity
Copy link
Member

A function pointer call has a lot more to do than just some reads, but maybe the overhead isn't noticeable?

@rhettinger rhettinger removed their request for review August 16, 2025 04:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants