-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
The hash and equals methods of code objects are large, complex and unnecessary. #101346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
This will involves a slight change in semantics: >>> def (): pass
>>> c = f.__code__
>>> c.replace() == c
True With this change: >>> c.replace() == c
False |
It's a good idea. How i understand, we need to delete |
I assume this meant to say "by identity."
So do we need a SC ruling on whether this needs a deprecation process? I guess technically we could raise a |
I put up a pull request for this, and looking at the resulting test failures in the CPython test suite alone has changed my mind about whether this is a good idea (or at the very least, whether we can do it without a deprecation process.) Dynamically compiling Python source strings to code objects is not uncommon, and in that scenario it is a) quite likely that the same input will result in the "same" output code object, position information and all, and b) quite likely that the test suite for such dynamic-compilation code may be using code object equality. And in fact this case does occur in our own test suite (e.g. in I didn't see any reasonable way to fix |
We have lots of tests that test artifacts of the implementation, rather than any spec. |
Indeed, I hoped initially it was not as bad as it appeared! But when I dug into trying to fix |
What is Whatever notion of equivalence it has should be spelled, rather than using |
Given the bytecode changes from one version to the next, I suspect a change in code object equality will be the least of their problems. We have changed the behavior of code object equality several times in the past. |
I think in principle what
Has it ever been defined in a way that would mean that
Constants make this non-trivial, since you have to check the types of constants and, if they are code objects, recursively use whatever definition of pseudo-code-object equality you have chosen on those nested code objects too, instead of just comparing them. |
We should just compare code objects by identity.
Technically, they are immutable values but realistically, there is only on code object per source unit.
Attempting to maintain value semantics when the underlying code object is internally mutable (due to specialization and instrumentation) is cumbersome and probably buggy, and will only get worse with more optimizations.
We should just delete the hash and comparison methods and have identity comparison. It will be faster too.
Linked PRs
The text was updated successfully, but these errors were encountered: