-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
ENH: use caching memory allocator in more places #8920
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Does this now mean that |
Also,
This no longer strikes me as |
yes, they should now be all gone. that (undocumented ...) code exists so any dimension allocation and free creates an entry for the array metadata allocation to use. These always need one for the dimension and one for the strides. |
Seems risky to me, as now someone could try and change the strides with:
Which would presumably leave the cache in an invalid state of some kind? I think we need to change the implementation of As a result, the cache pointers now need to know their own size, as |
/* | ||
* make sure any temporary allocation can be used for array metadata which | ||
* uses one memory block for both dimensions and strides | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still don't follow here - how do we end up on this code path? To me, it seems like we always passed an even number in the past.
So the only situation is when sz == 0
. Why do we allocate any memory at all? Why would you ask for 0 bytes then proceed to write two of them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most important reason for existence of the cache is the npy_alloc_cache_dim(2 * nd)
allocation in PyArray_NewFromDescr_int
. In the important cases (small arrays) nd is often one.
To fill the cache with more values for arrays to use all other cache paths also allocate at least two entries.
It is probably not a particularly important optimization and could be removed with hardly an impact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but before this PR that was the only cache path.
So is this really a performance optimization so that 0d arrays preallocate data for 1d arrays? Or is this just working around calling malloc(0)
, which might return NULL
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a similar note, we'll be missing the existing cache a lot more here since dims alone can be odd - is it worth looking always looking up in the cache with sz | 1
or something, so that the lookup is always odd?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm right it there was only one path, it was premature optimization then. I probably planned to do this change a lot sooner ;)
for something to miss one already has to deal with 3d arrays. these are pretty unlikely to be so small that the cache is relevant.
The cache really does not need to be perfect. It just needs to cover the extremely common alloc + immediate dealloc loop which it does.
The cache is designed so it cannot be put into an invalid state. It caches malloc/free pointers without any metadata, so you can continue to use free on something that comes from the cache. |
Doesn't this leave it in the cache though, which uses up a slot forever? I guess you're right though, it's not strictly invalid |
alloc_cache removes the pointer from the cache, free_cache adds it to the cache. |
PyArray_IntpConverter passes a pointer to the user so external users can starve the cache with it. But it is used sufficiently in numpys important functions so it shouldn't be an issue in practice. |
☔ The latest upstream changes (presumably #8885) made this pull request unmergeable. Please resolve the merge conflicts. |
be165b0
to
495ed47
Compare
☔ The latest upstream changes (presumably #9202) made this pull request unmergeable. Please resolve the merge conflicts. |
In particular use it in PyArray_IntpConverter which is used in many function entrypoints.
Remove unlikely, as it is being used now more it may not be correct anymore.
495ed47
to
c114169
Compare
any comments? |
In particular use it in PyArray_IntpConverter which is used in many
function entrypoints.