-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
ENH Break cyclic references in Histogram GBRT #18334
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH Break cyclic references in Histogram GBRT #18334
Conversation
Hm I am getting |
That's weird... maybe my Python GC is smarter than yours? I am running Python 3.8.3. |
Why do you say this PR is less reliable @ogrisel? |
I re-ran the snippet several times (restarting Python each time) and I consistently get ~140 MB increments instead of the 790.27 MB observed by @thomasjpfan on his machine. |
@thomasjpfan still observes a 790.27 MB increment with this PR on his machine so the GC does not seem to free memory as fast as needed. Therefore, explicit memory management as done in #18242 might still be useful for some users. |
I too consistently observe about 140MB on this PR with Python 3.8.2, on 2 different machines. (1, 4, and 16 threads. Not that it should matter anyway.) @thomasjpfan can you double check the benchmark you ran make sure you used the same snippet above? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ogrisel
I have a preference for this solution considering its simplicity and it because of its "unitaryness" (brain is frozen, can't think of a word)
We can still use np.empty
and initialize histograms in parallel. However I'd rather leave that for a different PR. I feel like #18242 does too many things at once and it's hard to determine the effect of each change.
My only concern here is: how do we prevent cyclic refs from sneaking in in the future? I guess we should write a memory benchmark in asv
?
I agree. It was in the back of my head but I wanted to wait a bit before adding it. @thomasjpfan what's your feeling w.r.t. the need to introduce the |
I think we should:
If 1 does its job then I doubt 3 would add a significant memory improvement but it might add a small speed improvement, since it will remove the need from allocating memory each time. |
Let's merge this one first.
|
Thanks @thomasjpfan for the investigation. Let me add a what's new entry to this PR if you want to merge it as it is. However the results on macOS makes me think that we will need to add the |
Done. Feel free to merge and then rework #18242 on top of this... |
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
Resolves #18152
This is an alternative to #18163 and #18242
This is a minimal fix to:
TreeNode
instance as soon as no longer needed (as done in HistGradientBoosting memory improvement #18163).For the reproducer snippet of #18242:
I get:
(instead of 6GB+ on master).