
FIX fixes memory leak seen in PyPy in C losses #27670


Merged: 7 commits merged into scikit-learn:main on Oct 31, 2023

Conversation

glemaitre
Member

partially addresses #27662

Avoid using np.asarray, which creates a reference to the memory view that does not seem to be garbage collected in PyPy.
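As a minimal sketch of the behaviour being avoided (plain NumPy here; the actual change is in scikit-learn's Cython loss code): `np.asarray` on a memoryview does not copy, it wraps the existing buffer and holds a reference to it, so the buffer stays alive as long as the wrapper does. CPython's reference counting frees it promptly; PyPy's GC can delay collection, which is what showed up as a leak.

```python
import numpy as np

buf = np.zeros(4)
view = memoryview(buf)

# np.asarray does not copy: the result wraps the existing buffer
# and keeps a reference to it, so the memory cannot be collected
# while `wrapped` is alive.
wrapped = np.asarray(view)
wrapped[0] = 42.0

print(buf[0])                    # 42.0: same underlying memory
print(wrapped.flags["OWNDATA"])  # False: wrapped does not own its data
```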

@github-actions

github-actions bot commented Oct 26, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 32e9f17.

@lesteve
Member

lesteve commented Oct 26, 2023

The PyPy tests now run up to 98%, almost there!

@betatim
Member

betatim commented Oct 27, 2023

Is the plan to merge this even before we get to 100%? From a quick look at the diff it looks uncontroversial. I think technically it changes the API of the loss functions (they now return None) but I think they are internal so it isn't something we need to worry about.

@lorentzenchr do you want to have a look and think about this (I think you wrote a lot of this code?)?

@lorentzenchr
Member

We can remove those return statements; I guess the only use is testing and the Python loss.py submodule.
I had them there because I don't like Fortran-style coding, and they made using those functions more Pythonic.

The docstrings need to be modified, too.

@betatim
Member

betatim commented Oct 30, 2023

I assume the Fortran style is needed to make sure we reuse the same array/memory each time, and that this results in a significant performance improvement. Is that the right assumption? I don't know enough about the history of this code and the discussions around it, so if this has been discussed before I don't want to restart an old discussion, just check that someone has thought about this topic. Because I agree that having a more Pythonic API would be nicer.

@lorentzenchr
Member

I assume the Fortran style is needed to make sure we reuse the same array/memory each time and results in a significant performance improvement.

We want to achieve ufunc-like behaviour, e.g. np.add(a, b, out=c), and pass in array c instead of creating it each time (c can be the element-wise loss or a gradient). This saves memory and the time to allocate that memory on each call. It also fits perfectly with our fitting algorithms, where we usually only need the current gradient (and sometimes the previous one, but no more).
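The `out=` pattern described above can be sketched with plain NumPy ufuncs (the squared-error computation below is only illustrative, not scikit-learn's actual loss code):

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 1.5, 2.0])

# Pre-allocate the output buffer once...
loss_out = np.empty_like(y_true)

# ...then reuse it on every call, ufunc-style, instead of
# allocating a fresh array per iteration.
np.subtract(y_true, y_pred, out=loss_out)
np.multiply(loss_out, loss_out, out=loss_out)  # squared error, in place

print(loss_out)  # [0.25 0.25 1.  ]
```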

@glemaitre
Member Author

This saves memory and the time to allocate that memory each time

I also like the fact that we usually allocate the memory via NumPy in Python and just interact with it. We don't have to free the buffer ourselves.
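A sketch of that ownership split, with a hypothetical in-place gradient helper standing in for the Cython routines: Python allocates the buffer through NumPy and keeps ownership, while the helper only fills it, so nothing on the C side ever has to free memory.

```python
import numpy as np

def fill_gradient(y_true, raw_prediction, gradient_out):
    """Hypothetical stand-in for a Cython loss routine: fills
    gradient_out in place (gradient of half squared error)."""
    np.subtract(raw_prediction, y_true, out=gradient_out)

y_true = np.array([1.0, 2.0])
raw_prediction = np.array([1.5, 2.5])

# The caller allocates through NumPy and owns the buffer; Python's
# memory management frees it when it goes out of scope.
gradient = np.empty_like(y_true)
fill_gradient(y_true, raw_prediction, gradient)

print(gradient)  # [0.5 0.5]
```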

glemaitre and others added 2 commits October 30, 2023 11:44
Co-authored-by: Tim Head <betatim@gmail.com>
Co-authored-by: Tim Head <betatim@gmail.com>
@lesteve lesteve merged commit a5fed0d into scikit-learn:main Oct 31, 2023
@lesteve
Member

lesteve commented Oct 31, 2023

Merging, thanks! Let's see if the PyPy build manages to complete from time to time with this fix 🤞 !

REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023