
[MRG] Fix #13134: Ensure sorted bin edges for KBinsDiscretizer strategy kmeans #13135


Merged
merged 1 commit into from
Feb 12, 2019


SandroCasagrande
Contributor

Reference Issues/PRs

Fixes #13134

What does this implement/fix? Explain your changes.

The cluster centers returned by 1D k-means are not guaranteed to be sorted, so I added a sort before constructing the bin edges. I also added a test that reproduces the issue when the centers are not sorted.
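The idea can be sketched in plain NumPy. This is an illustration rather than the actual diff: the helper name `kmeans_bin_edges` and the center values are hypothetical, and the midpoint construction is an assumption about how the edges are derived from the centers.

```python
import numpy as np


def kmeans_bin_edges(centers, col_min, col_max):
    """Build bin edges from 1D cluster centers (hypothetical helper).

    The centers are sorted first so that the midpoints between
    neighbouring centers, and therefore the edges, are increasing.
    """
    centers = np.sort(np.asarray(centers, dtype=float))
    inner = (centers[1:] + centers[:-1]) * 0.5  # midpoints between neighbours
    return np.r_[col_min, inner, col_max]


# Hypothetical centers in the arbitrary order k-means might return them.
unsorted_centers = [0.25, 9.5, 2.0, 3.0]
edges = kmeans_bin_edges(unsorted_centers, col_min=0.0, col_max=10.0)
print(edges)
```

Without the `np.sort` call, the same input would give midpoints in the order 4.875, 5.75, 2.5, which is not monotonic.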

Any other comments?

Sorting the centers should add negligible cost compared to k-means itself. In most cases the centers are already sorted, but as far as I know, checking whether a NumPy array is sorted is no cheaper than calling sort() directly (https://stackoverflow.com/questions/47004506/check-if-a-numpy-array-is-sorted), so I skipped the check.
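For comparison, the two options look roughly like this. The `is_sorted` helper and the center values are hypothetical and only illustrate the trade-off; no timings are claimed here.

```python
import numpy as np


def is_sorted(a):
    """Return True if the 1D array is in non-decreasing order (hypothetical helper)."""
    return bool(np.all(a[:-1] <= a[1:]))


centers = np.array([0.25, 2.0, 3.0, 9.5])  # hypothetical, already sorted

# Option A: check first and sort only when needed.
checked = centers if is_sorted(centers) else np.sort(centers)

# Option B: sort unconditionally, which is what this PR does.
always_sorted = np.sort(centers)
```

Both options yield the same array. Option A scans the whole array for the check anyway, so it mainly adds a branch.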

@SandroCasagrande changed the title from "Fix #13134: Ensure sorted bin edges for KBinsDiscretizer strategy kmeans" to "[MRG] Fix #13134: Ensure sorted bin edges for KBinsDiscretizer strategy kmeans" on Feb 11, 2019
Member

@jnothman jnothman left a comment

I've not confirmed the test fails at master, but this lgtm

Please add an entry to the change log at doc/whats_new/v0.20.rst under 0.20.3. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors, if applicable) with :user:.

@qinhanmin2014
Member

> I've not confirmed the test fails at master

It fails at master.
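The failure mode can be illustrated without scikit-learn, assuming the transform step bins values with `np.digitize`, which rejects non-monotonic bins. The data and edge values below are hypothetical and are not taken from the PR's test.

```python
import numpy as np

x = np.array([0.0, 0.5, 2.0, 3.0, 9.0, 10.0])

# Hypothetical edges built from unsorted centers, so they are not monotonic.
unsorted_edges = np.array([4.875, 5.75, 2.5, 10.0])

try:
    np.digitize(x, unsorted_edges)
    failed = False
except ValueError as exc:
    # NumPy requires bins to be monotonically increasing or decreasing.
    failed = True
    print("digitize failed:", exc)

# With sorted edges, binning works as expected.
sorted_codes = np.digitize(x, np.sort(unsorted_edges))
print(sorted_codes)
```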

Member

@qinhanmin2014 qinhanmin2014 left a comment

LGTM, thanks @SandroCasagrande

@qinhanmin2014 qinhanmin2014 merged commit 9ac5793 into scikit-learn:master Feb 12, 2019
@qinhanmin2014 qinhanmin2014 added this to the 0.20.3 milestone Feb 18, 2019
@jnothman jnothman mentioned this pull request Feb 19, 2019
17 tasks
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019
Development

Successfully merging this pull request may close these issues.

KBinsDiscretizer: kmeans fails due to unsorted bin_edges
3 participants