-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
gh-91603: Speed up UnionType
instantiation
#91865
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this overall makes sense to me, but I found a few style issues and one bigger problem.
@@ -468,23 +547,12 @@ make_union(PyObject *args) | |||
{ | |||
assert(PyTuple_CheckExact(args)); | |||
|
|||
args = dedup_and_flatten_args(args); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still need this logic for the callsite in union_getitem
.
Consider this case:
>>> from typing import TypeVar
>>> T = TypeVar("T")
>>> (list[T] | list[int])
list[~T] | list[int]
>>> (list[T] | list[int])[int]
list[int]
If I'm reading your code correctly, it would no longer deduplicate the two members. It would be good to also add a test case based on this example.
We should probably put the deduping logic directly in union_getitem
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JelleZijlstra Actually, we don't need it, as far as such case already handled by tests:
cpython/Lib/test/test_types.py
Line 823 in 1cd8c29
self.assertEqual((list[T] | list[S])[int, int], list[int]) |
That's because after parameter substitution new UnionType
is recreated by reducing newargs
using bitwise or
operator:
Lines 380 to 388 in 1cd8c29
res = PyTuple_GET_ITEM(newargs, 0); | |
Py_INCREF(res); | |
for (Py_ssize_t iarg = 1; iarg < nargs; iarg++) { | |
PyObject *arg = PyTuple_GET_ITEM(newargs, iarg); | |
Py_SETREF(res, PyNumber_Or(res, arg)); | |
if (res == NULL) { | |
break; | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh good catch! That seems a bit inefficient too, but we don't need to fix that in this PR; in any case it's probably a less performance-critical path.
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nitpicks. Did not review the code deeply yet.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
I have made the requested changes; please review again |
Thanks for making the requested changes! @JelleZijlstra, @serhiy-storchaka: please review the changes made to this pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM in general, I just have one question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM, except _PyTuple_Resize
and some formatting.
But the code can be much smaller. I have refactored it in #91955.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Although I prefer #91955, which is simpler. 😉
Closed because #91955 merged |
#91603
Summary
Remove
types.UnionType.__args__
recreation (flatten_args
anddedup_and_flatten_args
).Use the fact that in the case when union with another
types.UnionType
that union has deduplicated normalized__args__
.As a result complexity from
O((M+N)^2)
was reduced toO(M*N)
, whereM
- length of left union args,N
- length of right union args.Benchmarks:
For big unions: