t-SNE has inefficient memory structure #7089
Comments
Does this still need a contributor? I'll work on this issue. |
It does. It's somewhat non-trivial but you're very welcome to give it a go! |
Ok, I will give it a go then. Glad to be helping on a 1.0 milestone! |
@shanglun Have you made any progress? I am also willing to contribute some code for this issue. :) |
Yes, I am still looking into it and I am making progress. I will reach out On Aug 8, 2016 11:31 AM, "Zhexuan Zachary Yang" notifications@github.com
|
Great. Please let me know if you need any help. :)
|
In this work, the authors proposed a new method for visualizing high-dimensional data (LargeVis), building on ideas from a previous study (LINE). I ran some tests and the method is really fast. They proposed an interesting algorithm for building a fairly accurate kNN graph. In the experiments section, the authors mention that they parallelized parts of t-SNE, so maybe you could ask them to contribute. BTW, I'm looking forward to seeing LINE and LargeVis in future versions ;) |
Interesting. I think these studies might be out of scope for this Was your benchmarking written in Python? Or another language? On Aug 10, 2016 1:39 AM, "Claudio Sanhueza" notifications@github.com
|
I did the tests in C++. What are you specifically modifying in t-SNE? Memory management? |
Yeah, still investigating and experimenting, but based on discussions in If you have the C++ code I'd very much love to collaborate on a new On Aug 10, 2016 10:02 PM, "Claudio Sanhueza" notifications@github.com
|
If you just want to see another implementation option in t-SNE you can On Aug 10, 2016 10:11 PM, "Shanglun Wang" shanglunwang@gmail.com wrote:
|
The original implementation of t-SNE was done in C++. You can find the original sources here. I just adapt:
Originally, all the input data is contained in a specific formatted binary file created by the wrappers. |
Oh, I see, the improvements you mentioned in the original post was just I was under the impression that there was some new implementation of the Let's follow up when I finish this ticket, and we can look into optimizing On Aug 10, 2016 10:39 PM, "Claudio Sanhueza" notifications@github.com
|
Not very happy about this but I have hit a busy period with work and will On Aug 10, 2016 10:44 PM, "Shanglun Wang" shanglunwang@gmail.com wrote:
|
More discussion of this issue and its potential solution over at #8582 |
I'm so annoyed by this false advertising that I'm tempted to fix it myself. But given my lack of availability, I'm going to mark it for the sprint and hope that someone in Paris can give it a go. |
Maybe this can help to improve things. |
It is not so hard to improve our implementation. It just needs someone
confident and available to do it.
That implementation may remain faster; we could consider adopting its code,
with permission, but not with a cffi dependency (and perhaps not other
dependencies it builds on).
On 28 May 2017 12:02 pm, "Claudio Sanhueza" <notifications@github.com> wrote:
Maybe this can help to improve things.
https://github.com/DmitryUlyanov/Multicore-TSNE
|
Fixed in #9032 |
The Barnes-Hut implementation of t-SNE currently uses a dense matrix representation for the distances, but it should use a sparse matrix.
For discussion, see #4025.
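To give a rough sense of why the dense representation matters, here is a back-of-the-envelope sketch of the memory footprint. It is not taken from the scikit-learn code: the helper names `dense_bytes` and `csr_bytes` are hypothetical, and it assumes float64 values, 32-bit indices, a CSR layout, and about 3 * perplexity neighbors per sample (90 for a perplexity of 30), which is the neighborhood size usually associated with Barnes-Hut t-SNE.

```python
def dense_bytes(n_samples, itemsize=8):
    """Bytes for a full n_samples x n_samples distance matrix."""
    return n_samples * n_samples * itemsize


def csr_bytes(n_samples, n_neighbors, itemsize=8, index_size=4):
    """Bytes for a CSR matrix storing n_neighbors distances per row.

    CSR keeps three arrays: the values and their column indices (one entry
    per stored distance), plus n_samples + 1 row pointers.
    """
    nnz = n_samples * n_neighbors
    return nnz * itemsize + nnz * index_size + (n_samples + 1) * index_size


# Assumed example sizes: 70,000 samples, 90 neighbors per sample.
n_samples, n_neighbors = 70_000, 90
dense = dense_bytes(n_samples)
sparse = csr_bytes(n_samples, n_neighbors)
print(f"dense:  {dense / 1e9:.1f} GB")
print(f"sparse: {sparse / 1e6:.1f} MB")
print(f"ratio:  {dense / sparse:.0f}x")
```

The dense cost grows quadratically with the number of samples, while the sparse cost grows linearly for a fixed neighborhood size, so the gap widens as the dataset grows.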