Skip to content

ENH Remember predecessor in OPTICS #12135

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 7, 2018

Conversation

kno10
Copy link
Contributor

@kno10 kno10 commented Sep 22, 2018

This will allow filtering the extracted clusters,
in order to improve quality (see Schubert and Gertz, LWDA 2018)

Reference Issues/PRs

Predecessor information was requested in #12049

What does this implement/fix? Explain your changes.

Store the predecessor of each point, which is useful information to filter certain difficult points when extracting clusters (see Schubert and Gertz, LWDA 2018).

Any other comments?

The extraction methods still need to be modified to use this.

This will allow filtering the extracted clusters,
in order to improve quality (see Schubert and Gertz 2018)
@adrinjalali
Copy link
Member

Awesome, I think it'd be nice if we had a test to check the predecessor_ values though.

@kno10
Copy link
Contributor Author

kno10 commented Sep 24, 2018

@adrinjalali predecessor, reachability, order (+ a consistency check for core sizes) are tested against ELKI in kno10@abaa35a

@adrinjalali
Copy link
Member

Oh yeah, didn't notice the other arrays in that test of yours when I had a look. Cool! Thanks @kno10

@kno10
Copy link
Contributor Author

kno10 commented Sep 25, 2018

I just added these tests last night, so you probably just saw the first commit of these.

Copy link
Member

@qinhanmin2014 qinhanmin2014 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll check our implementation with ELKI & R dbscan and we deeply appreciate your contributions here.

@qinhanmin2014 qinhanmin2014 merged commit 4280308 into scikit-learn:master Oct 7, 2018
@adrinjalali
Copy link
Member

@qinhanmin2014 the test @kno10 to which he's referring IS the output of ELKI if I'm not mistaken. He also has the command with which you can reproduce the results, which is awesome, cause the filter you need to add to process the csv is not trivially explained in the ELKI docs.

I think we just need to either create a PR from that, or incorporate it into #12054.

@qinhanmin2014
Copy link
Member

Yes I've seen that. I think we might replace #12054 with that one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants