OPTICS should not call kneighbors on all the data at once #12098

Closed
jnothman opened this issue Sep 17, 2018 · 2 comments

Comments

@jnothman
Member

OPTICS uses only the distance to the last neighbor returned by its call to kneighbors:

self.core_distances_[:] = nbrs.kneighbors(X,

It therefore wastes memory on an array of shape (X.shape[0], min_samples). It could instead get the kneighbors results in multiple chunks (perhaps sized so that working_memory is not exceeded by any chunk), so that only a fixed amount of memory plus X.shape[0] distances is stored.
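
A minimal sketch of the chunked approach described above, not the change that was eventually merged. The helper name chunked_core_distances and the per-row cost estimate (one float64 distance plus one int64 index per neighbor) are assumptions made for illustration. The chunk size is derived from scikit-learn's working_memory setting, which is expressed in MiB.

```python
import numpy as np
from sklearn import get_config
from sklearn.neighbors import NearestNeighbors


def chunked_core_distances(X, min_samples, working_memory=None):
    """Distance from each row of X to its min_samples-th nearest neighbor.

    Hypothetical helper for illustration; not part of scikit-learn's API.
    """
    X = np.asarray(X)
    n_samples = X.shape[0]
    nbrs = NearestNeighbors(n_neighbors=min_samples).fit(X)

    if working_memory is None:
        working_memory = get_config()["working_memory"]  # in MiB

    # Assumed cost of one row of a kneighbors result:
    # a float64 distance and an int64 index for each neighbor.
    row_bytes = 16 * min_samples
    chunk_n_rows = max(1, int(working_memory * 2**20 // row_bytes))

    # Only this 1-D array persists across chunks.
    core_distances = np.empty(n_samples)
    for start in range(0, n_samples, chunk_n_rows):
        stop = min(start + chunk_n_rows, n_samples)
        dist, _ = nbrs.kneighbors(X[start:stop], n_neighbors=min_samples)
        # Keep only the last column: the distance to the farthest
        # of the min_samples neighbors.
        core_distances[start:stop] = dist[:, -1]
    return core_distances
```

Each chunk's (chunk_n_rows, min_samples) result is discarded once its last column has been copied, so peak memory is bounded by the chunk size plus the X.shape[0] stored distances.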

@kss682

kss682 commented Sep 18, 2018

Hi,
I have only been a user of sklearn so far, and I would like to work on this issue.
How can I start?
Thanks

@adrinjalali
Member

Hi @kss682, it's nice to have new people on board. You probably want to read the contributing page if you haven't already (http://scikit-learn.org/dev/developers/contributing.html). Before taking an issue, you should always check whether somebody else is already working on it. This one has already been addressed in PR #12103, which you can see linked here.

4 participants