-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[MRG+1] Fix memory leak in Barnes-Hut SNE #5983
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@AlexanderFabisch I have got the tests of #5689 to pass. The fix was to pass random state properly. But I still don't think that should interfere with SNE in any way. |
I adapted your snippet to monitor memory usage with psutil and the leak seems to be fixed: import psutil
import numpy as np
from sklearn.manifold import TSNE
X = np.random.rand(10, 10)
p = psutil.Process()
for i in range(100000):
TSNE().fit(X)
if i % 100 == 0:
print(p.memory_info().rss / 1e6) +1 for merge on my side. |
@vighneshbirodkar can you detect a memory usage increase with my above snippet in your environment? Can you try both on master and with this PR? |
@ogrisel |
Alright, it's good to fix valgrind reported issues anyway. So still +1. |
free(tree.root_node.left_edge) | ||
free(tree.root_node.center) | ||
free(tree.root_node.barycenter) | ||
free(tree.root_node.leaf_point_position) | ||
free(tree.root_node) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have just wondering if it would not be cleaner to re-arrange the way things are done in the free_recursive
loop so that we don't need to duplicate this sequence of free for the root in the caller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlexanderFabisch have you seen my previous comment? Would you like to address it or let someone else to take over from there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for not replying. I have no time to do the refactoring before March probably. It should not be too difficult though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ogrisel
Do you mean duplication on these lines ?
https://github.com/AlexanderFabisch/scikit-learn/blob/fix_bhsne/sklearn/manifold/_barnes_hut_tsne.pyx#L399
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vighneshbirodkar yes they are duplicated in free_tree
. I think the organization of free_recursive
could be changed so that we would not have to repeat those lines in free_tree
.
That seems to fix #5916 for me. @vighneshbirodkar still seems to have problems.
Here is a test program:
You can run it with
valgrind --leak-check=full --track-origins=yes python test_tsne.py 2> logfile
and search for "barnes" in the logfile.