COMPAT: notify garbage collector when memory is allocated #8907

Merged
merged 1 commit into from
Apr 7, 2017

mattip
Member

@mattip mattip commented Apr 7, 2017

PyPy's garbage collector (GC) starts a cycle based on an algorithm that takes allocated memory into account, so it must be told about allocations made outside of Python. This patch triggers more frequent GC cycles when ndarrays are allocated in a tight loop. That is not efficient, but such a loop will now at least run to completion without consuming all the free memory in the system.
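
As a rough illustration of why such a loop could exhaust memory, here is a toy Python model. It is not PyPy's GC and not the code in this patch. It assumes a collector that starts a cycle only once a byte counter crosses a threshold, and that dead arrays are freed only when a cycle runs. The function name `peak_external_bytes` and all of its parameters are invented for this sketch.

```python
def peak_external_bytes(n_arrays, array_bytes, wrapper_bytes,
                        threshold, report_pressure):
    """Simulate a tight allocation loop under a threshold-triggered GC.

    Each iteration creates one array: a small wrapper object that the GC
    tracks, plus a data buffer allocated outside the GC. The previous
    array becomes garbage immediately, but its buffer is only released
    when a collection cycle runs. Returns the peak number of external
    bytes held at any point in the loop.
    """
    counter = 0      # bytes the GC has been told about since the last cycle
    outstanding = 0  # external buffer bytes not yet released
    peak = 0
    for _ in range(n_arrays):
        counter += wrapper_bytes          # the wrapper is always tracked
        if report_pressure:
            counter += array_bytes        # the buffer is reported explicitly
        outstanding += array_bytes
        peak = max(peak, outstanding)
        if counter >= threshold:
            # A cycle frees every dead array; only the newest one survives.
            outstanding = array_bytes
            counter = 0
    return peak


if __name__ == "__main__":
    # Hypothetical sizes: 1000 arrays of 1 MB, 100-byte wrappers, 10 MB threshold.
    args = (1000, 1_000_000, 100, 10_000_000)
    print("peak without reporting:", peak_external_bytes(*args, report_pressure=False))
    print("peak with reporting:   ", peak_external_bytes(*args, report_pressure=True))
```

In this model the unreported loop never reaches the threshold, so every buffer stays allocated until the end, while the reported loop collects often and keeps the peak close to the threshold.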

@charris
Member

charris commented Apr 7, 2017

Thanks @mattip .

@njsmith
Member

njsmith commented Apr 7, 2017

Did you really mean PyPyPyGC? It's weird that the ifdef and call don't match...

I'm pretty sure this isn't the only place we allocate memory... (At the least there has to be both malloc and calloc calls.) Is it that the other places go through the PyMem APIs and are already covered, or...?

@mattip
Member Author

mattip commented Apr 9, 2017

@njsmith The code as it stands is correct but admittedly misleading: the two functions are aliases for one another. Explanation: PyPy prevents mixing CPython headers with the PyPy runtime by adding an extra Py to each name exported in the shared object. For instance, _PyPyGC_AddMemoryPressure appears in the public headers and _PyPyPyGC_AddMemoryPressure is exported in the shared object. The header pypy_decl.h contains this magic.

I assumed all ndarray data allocations are done with _npy_alloc_cache (used by both npy_alloc_cache and npy_alloc_cache_zero). It seems that assumption is incorrect only in array_setstate, which calls PyDataMem_NEW directly. AFAICT all the other calls are for short-lived buffers that are quickly released, or for small bits of memory needed to hold pointers.

Should array_setstate be using npy_alloc_cache rather than PyDataMem_NEW?

@juliantaylor
Contributor

How does this work exactly? When the GC runs, does it just set the memory pressure back to zero?
If not, shouldn't there be a remove-pressure function in free?

@mattip
Member Author

mattip commented Jun 1, 2017

The first answer is correct. Every GC-tracked allocation (all PyPy internal allocations are tracked) is added to a counter; when the counter is large enough, a GC collection cycle is triggered. The counter is reset after each cycle. Adding memory pressure simply increments the counter.
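
The counter behaviour described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not PyPy's implementation: the class name `PressureCounter`, its methods, and the fixed threshold are all hypothetical, and a real collector chooses its trigger point with a more involved algorithm.

```python
class PressureCounter:
    """Toy model of a GC trigger driven by a single byte counter."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical fixed trigger point, in bytes
        self.counter = 0            # bytes seen since the last cycle
        self.cycles = 0             # number of collection cycles run so far

    def _add(self, nbytes):
        self.counter += nbytes
        if self.counter >= self.threshold:
            self.cycles += 1        # a collection cycle would run here
            self.counter = 0        # the counter is reset after each cycle

    def tracked_alloc(self, nbytes):
        """An allocation made by the runtime itself, always counted."""
        self._add(nbytes)

    def add_memory_pressure(self, nbytes):
        """An external allocation reported by an extension module."""
        self._add(nbytes)


if __name__ == "__main__":
    gc_model = PressureCounter(threshold=100)
    gc_model.tracked_alloc(30)
    gc_model.add_memory_pressure(60)   # counter is 90, no cycle yet
    gc_model.add_memory_pressure(10)   # counter reaches 100, one cycle runs
    print(gc_model.cycles, gc_model.counter)
```

Because the model resets the counter on every cycle, it has no operation for removing pressure when memory is freed; reported bytes only ever influence when the next cycle starts.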

4 participants