-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
gh-100758: Refactor initialisation of frame headers into a single function (_PyFrame_Initialize) #100759
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…le function (_PyFrame_InitializeHeader)
This looks OK to me, pending benchmarking. It means we will do one or two extra NULL writes to localsplus entries that are about to get written again with a value, in some hot paths that use . (This could be avoided with an extra integer argument.) Will wait and see what Mark thinks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like @carljm 's suggestion of passing in the index of the first local to NULL out.
I'd like to see the benchmarks.
If benchmarking shows no difference, then go with this approach as it is the cleanest.
These are benchmarking results from before I changed to pass the int arg: (overall x0 faster). I will rerun with the last change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me, presuming the second benchmark run will also show no impact.
The results for this version are here: https://github.com/faster-cpython/benchmarking-public/blob/main/results/bm-20230105-3.12.0a3%2B-d501577/bm-20230105-linux-x86_64-iritkatriel-InitializeHeader-3.12.0a3+-d501577-vs-base.md 0x00 slower. |
Might as well go with the faster version. |
You don't think this is noise? |
I created #100775 with the arg removed, will run the benchmark one more time. |
I'm merging this, as the performance difference with this and #100775 is in the noise, and this seems the more efficient design. |
Fixes #100758.
This is version 1 (with an arg telling where to start NULLing).