-
-
Notifications
You must be signed in to change notification settings - Fork 8.2k
py/vm.c: Avoid heap-allocating slices for built-ins. #12518
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Code size report:
|
Not sure why, but this is failing on unix CI. |
It's failing the "doesn't heap allocate" when using emit-native. Need to add an exclusion for that test when using emit-native. |
Done. |
Codecov Report
@@ Coverage Diff @@
## master #12518 +/- ##
=======================================
Coverage 98.39% 98.39%
=======================================
Files 158 158
Lines 20939 20946 +7
=======================================
+ Hits 20602 20609 +7
Misses 337 337
|
// this for instance types because the get/set/delattr | ||
// implementation may keep a reference to the slice. | ||
byte op = *ip++; | ||
mp_obj_slice_t slice = { .base = { .type = &mp_type_slice }, .start = start, .stop = stop, .step = step }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This possibly increases C stack usage of this function. Will need to quantify that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed... +24 bytes of stack.
I've added a commit to address this (using alloca). Slight code size increase though (was +120, now +152).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted this for now... too complicated to figure out how this works with MICROPY_ENABLE_PYSTACK
.
Fast path optimisation for when a BUILD_SLICE is immediately followed by a LOAD/STORE_SUBSCR for a native type to avoid needing to allocate the slice on the heap. In some cases (e.g. `a[1:3] = x`) this can result in no allocations at all. We can't do this for instance types because the get/set/delattr implementation may keep a reference to the slice. Adds more tests to the basic slice tests to ensure that a stack-allocated slice never makes it to Python, and also a heapalloc test that verifies (when using bytecode) that assigning to a slice is no-alloc. This work was funded through GitHub Sponsors. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
04fa3f6
to
ec15a0b
Compare
This is an alternative to #10160 and implements a different variation on @damz's idea. It's still worth considering making the compiler do this (via a new
SLICE_AND_SUBSCR
opcode), but this way also has the benefit of handing the is-instance-type check in a simple way.Unlike #10160 though it only helps for bytecode and not the native emitter.
Fast path optimisation for when a BUILD_SLICE is immediately followed by a LOAD/STORE_SUBSCR for a native type to avoid needing to allocate the slice on the heap.
In some cases (e.g.
a[1:3] = x
) this can result in no allocations at all.We can't do this for instance types because the get/set/delattr implementation may keep a reference to the slice.
This work was funded through GitHub Sponsors.