ENH: use caching memory allocator in more places #8920


Merged (2 commits, Jul 20, 2017)

Conversation

juliantaylor
Contributor

In particular use it in PyArray_IntpConverter which is used in many
function entrypoints.

@eric-wieser
Member

Does this now mean that PyDimMem_NEW and PyDimMem_FREE are not used anywhere?

@eric-wieser
Member

eric-wieser commented Apr 10, 2017

Also, npy_alloc_cache_dim contains:

/* dims + strides */
    if (NPY_UNLIKELY(sz < 2)) {
        sz = 2;
    }

This no longer strikes me as NPY_UNLIKELY now that we're not passing nd * 2, and it's not all that clear to me why this is here in the first place - is there a problem with allocating 0?

@juliantaylor
Contributor Author

Yes, they should all be gone now.

That (undocumented ...) code exists so that any dimension allocation and free creates a cache entry the array-metadata allocation can use; that allocation always needs one block for the dimensions and one for the strides.
With the cache now being used everywhere, the unlikely is indeed questionable; it should just be a minimum.
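A minimal sketch of the behavior under discussion (this is not NumPy's actual implementation; the names, the one-pointer-per-bucket layout, and `ptrdiff_t` standing in for `npy_intp` are all simplifications):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical simplified dimension cache: one parked pointer per size
 * bucket, sizes counted in elements (ptrdiff_t standing in for npy_intp). */
#define CACHE_BUCKETS 16

static void *dim_cache[CACHE_BUCKETS];

static void *alloc_cache_dim(size_t sz)
{
    /* Bump to 2 so every cached block can later serve the common
     * "dims + strides" allocation, which needs at least 2 elements. */
    if (sz < 2) {
        sz = 2;
    }
    if (sz < CACHE_BUCKETS && dim_cache[sz] != NULL) {
        void *p = dim_cache[sz];   /* cache hit: pop the parked pointer */
        dim_cache[sz] = NULL;
        return p;
    }
    return malloc(sz * sizeof(ptrdiff_t));   /* miss: plain malloc */
}

static void free_cache_dim(void *p, size_t sz)
{
    if (sz < 2) {
        sz = 2;   /* must mirror the bump in alloc so buckets agree */
    }
    if (sz < CACHE_BUCKETS && dim_cache[sz] == NULL) {
        dim_cache[sz] = p;   /* park the pointer for the next request */
        return;
    }
    free(p);   /* bucket occupied or size too large */
}
```

With the bump in place, freeing a 1-element dims block parks a pointer that the next 2-element dims+strides request can reuse, which is the cache-filling effect described above.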

@eric-wieser
Member

eric-wieser commented Apr 10, 2017

Seems risky to me, as now someone could try and change the strides with:

PyDimMem_FREE(arr->strides);
arr->strides = PyDimMem_NEW(nd);

Which would presumably leave the cache in an invalid state of some kind?

I think we need to change the implementation of PyDimMem_NEW and PyDimMem_FREE to use the cache.

As a result, the cache pointers now need to know their own size, as PyDimMem_FREE does not take that argument.
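One way a size-less PyDimMem_FREE could learn the size is a hidden header in front of the block. This is only a sketch of that option (hypothetical names; not what the PR actually does):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Sketch: prefix each dimension block with its element count so a
 * size-less free could still return it to the right cache bucket. */
static void *dim_new(size_t nd)
{
    size_t *p = malloc(sizeof(size_t) + nd * sizeof(ptrdiff_t));
    if (p == NULL) {
        return NULL;
    }
    *p = nd;         /* stash the size in the hidden header */
    return p + 1;    /* hand out the payload after the header */
}

static size_t dim_size(const void *dims)
{
    return *((const size_t *)dims - 1);   /* read the header back */
}

static void dim_free(void *dims)
{
    /* a cache-aware version would use dim_size() to pick the bucket */
    free((size_t *)dims - 1);
}
```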

/*
* make sure any temporary allocation can be used for array metadata which
* uses one memory block for both dimensions and strides
*/
Member


Still don't follow here - how do we end up on this code path? To me, it seems like we always passed an even number in the past.

So the only situation is when sz == 0. Why do we allocate any memory at all? Why would you ask for 0 bytes then proceed to write two of them?

Contributor Author

juliantaylor commented Apr 10, 2017


The most important reason for the existence of the cache is the npy_alloc_cache_dim(2 * nd) allocation in PyArray_NewFromDescr_int. In the important cases (small arrays) nd is often one.
So that the cache fills with more values for arrays to use, all other cache paths also allocate at least two entries.
It is probably not a particularly important optimization and could be removed with hardly any impact.

Member


Right, but before this PR that was the only cache path.

So is this really a performance optimization so that 0d arrays preallocate data for 1d arrays? Or is this just working around calling malloc(0), which might return NULL?

Member


On a similar note, we'll be missing the existing cache a lot more here, since dims alone can be odd. Is it worth always looking up in the cache with sz | 1 or something, so that the lookup is always odd?
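The normalization being suggested could look like this (a sketch of the idea only; `cache_bucket` is a made-up name). Every size rounds up to the next odd number before indexing, so an odd dims-only size and the even size just below it share a bucket; the allocator would then have to allocate `sz | 1` elements so the block is large enough for either user.

```c
#include <assert.h>
#include <stddef.h>

/* Map a requested element count to a cache bucket that is always odd,
 * so e.g. sizes 2 and 3 land in the same bucket: 1->1, 2->3, 3->3, 4->5. */
static size_t cache_bucket(size_t sz)
{
    return sz | 1;
}
```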

Contributor Author


Hm, right; when there was only one path, it was premature optimization. I probably planned to do this change a lot sooner ;)
For the cache to miss, one already has to be dealing with 3d arrays, and those are pretty unlikely to be so small that the cache is relevant.
The cache really does not need to be perfect. It just needs to cover the extremely common alloc + immediate dealloc loop, which it does.

@juliantaylor
Contributor Author

The cache is designed so it cannot be put into an invalid state. It caches malloc/free pointers without any metadata, so you can continue to use free on something that comes from the cache.
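A minimal illustration of that property (hypothetical names; one slot rather than NumPy's per-size free lists): because the cache parks plain malloc() results with no header or metadata, a pointer taken from it is always a valid argument to free().

```c
#include <assert.h>
#include <stdlib.h>

static void *slot = NULL;   /* one-slot cache of a raw malloc() pointer */

static void *cache_alloc(size_t n)
{
    if (slot != NULL) {
        void *p = slot;   /* reuse the parked allocation */
        slot = NULL;
        return p;
    }
    return malloc(n);
}

static void cache_free(void *p)
{
    if (slot == NULL) {
        slot = p;     /* park it for the next cache_alloc */
    }
    else {
        free(p);      /* slot taken; the block is ordinary malloc memory */
    }
}
```

Calling plain free() on a cache_alloc() result is harmless; the only cost is that the slot is not refilled, which lowers the hit rate.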

@eric-wieser
Member

eric-wieser commented Apr 10, 2017

so you can continue to use free on something that comes from the cache.

Doesn't this leave it in the cache though, which uses up a slot forever? I guess you're right though, it's not strictly invalid

@juliantaylor
Contributor Author

juliantaylor commented Apr 10, 2017

alloc_cache removes the pointer from the cache; free_cache adds it to the cache.
Using free on a cacheable pointer just means you are not filling the cache back up again. That reduces the cache hit rate: you can starve the cache by taking pointers from it but never putting them back once you no longer need them.
Thus there are a couple of places where cache allocation is not used, because the pointer goes out to the user and we never get it back; using cache allocation in those cases could starve the cache.

@juliantaylor
Contributor Author

PyArray_IntpConverter passes a pointer to the user, so external users can starve the cache with it. But it is used heavily enough in numpy's important functions that this shouldn't be an issue in practice.

@homu
Contributor

homu commented Apr 30, 2017

☔ The latest upstream changes (presumably #8885) made this pull request unmergeable. Please resolve the merge conflicts.

@homu
Contributor

homu commented Jun 1, 2017

☔ The latest upstream changes (presumably #9202) made this pull request unmergeable. Please resolve the merge conflicts.

In particular use it in PyArray_IntpConverter which is used in many
function entrypoints.
Remove unlikely; now that this path is used in more places, it may not be
correct anymore.
@juliantaylor
Contributor Author

Any comments?
If not, I'll merge in the next few days, as this branch is so prone to conflicts.
