gh-96538: Move some type-checking out of bisect loops #96539
Conversation
if (res_obj == Py_NotImplemented) {
    Py_DECREF(res_obj);
    compare = NULL;
    res = PyObject_RichCompareBool(item, litem, Py_LT);
Remark: this could technically call __lt__, __lt__, __gt__ instead of the current behavior of calling __lt__, __gt__.
This is similar to what unsafe_object_compare does though.
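The effect of that fallback can be shown in pure Python. This is an illustrative sketch, not the CPython C code: calling a cached __lt__ directly and then falling back to a full comparison when it returns NotImplemented retries __lt__ before trying the reflected __gt__.

```python
# Illustrative sketch: why a cached-__lt__ fast path with a rich-compare
# fallback can invoke __lt__ twice before the reflected __gt__.
calls = []

class Left:
    def __lt__(self, other):
        calls.append('Left.__lt__')
        return NotImplemented

class Right:
    def __gt__(self, other):
        calls.append('Right.__gt__')
        return True

a, b = Left(), Right()

# Plain comparison: Python tries a.__lt__(b), then the reflected b.__gt__(a).
calls.clear()
a < b
print(calls)   # ['Left.__lt__', 'Right.__gt__']

# Fast path with fallback: call the cached __lt__ first; on NotImplemented,
# fall back to the ordinary comparison, which retries __lt__ itself.
calls.clear()
res = type(a).__lt__(a, b)
if res is NotImplemented:
    res = a < b
print(calls)   # ['Left.__lt__', 'Left.__lt__', 'Right.__gt__']
```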
I recognize that there has been a shift in core developer attitudes toward complexity, and that many things we would never have considered are now fair game if some clock cycles can be saved. With the interpreter, there was a high payoff that compensated for fewer people being able to understand and maintain the code. There is less of a case to be made for complexifying code on the periphery. It would be nice if some simple things remained simple. At one time, the whole module was basically just this code:
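The snippet itself did not survive extraction; from memory, the early pure-Python bisect was essentially this (exact historical spelling may differ slightly):

```python
# Classic pure-Python bisect: find the insertion point for x in sorted
# list a, to the right of any existing entries equal to x.
def bisect_right(a, x, lo=0, hi=None):
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if x < a[mid]:
            hi = mid
        else:
            lo = mid + 1
    return lo
```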
Tim, what are your thoughts?
The code seems pretty straightforward to me, and Dennis's timing results do show significant speedups for typical type-homogeneous cases, even for short lists. I think it would help if comments were added to point out which parts are logically necessary and which are just there to speed up the typical cases.
Tim, thanks for looking at this. It is not always clear how far we should go.
No argument here! IMO, "too many" changes go in that add too much complexity for too little gain. The changes here are conceptually simple, though, and the "1.31x faster" overall result looks credible, well-motivated, and more robust than the kinds of changes that make seemingly random edits to C spellings, yielding seemingly random "1.04x faster" results on some platforms but also minor slowdowns on others 😉. We're actually taking code off critical paths in this one.
Thanks for the reviews!
Looks like this PR introduced some reference leaks: there may be leaks at line 122 and possibly at line 301. Please check; otherwise, per our buildbot policy, we will need to revert it if it is not fixed within 24 hours (assuming the refleaks are confirmed to come from this PR).
Here's a benchmark program:
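(The benchmark program itself was not captured in this extract. A hypothetical sketch of such a benchmark, timing bisect.insort on a type-homogeneous list of ints — the case this PR's fast path targets — might look like this; the function names and sizes are illustrative, not the PR author's actual script:)

```python
# Hypothetical benchmark sketch -- not the original program from the thread.
# Times repeated bisect.insort calls on a homogeneous list of ints.
import random
import timeit
from bisect import insort

def bench(n=1000, repeats=5):
    """Return the best wall-clock time for inserting n ints via insort."""
    random.seed(0)
    data = [random.randrange(n) for _ in range(n)]

    def run():
        acc = []
        for x in data:
            insort(acc, x)

    return min(timeit.repeat(run, number=100, repeat=repeats))

if __name__ == "__main__":
    print(f"best of 5: {bench():.4f} s")
```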
And the results: