Skip to content

[MRG + 1] Labels of clustering should start at 0 or -1 if noise #10015

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 27, 2017

Conversation

albertcthomas
Copy link
Contributor

@albertcthomas albertcthomas commented Oct 26, 2017

Reference Issues/PRs

Addresses comments in PR #1984 about the first value of the clustering labels. It should start at 0 or -1 (if noise) for consistency.

What does this implement/fix? Explain your changes.

This adds a common test ensuring this behavior.

assert_greater_equal(pred_sorted[0], -1)
# labels_ should be less than n_clusters - 1
# Labels are expected to start at 0 (no noise) or -1 (if noise)
assert_true((labels_sorted[0] == -1) or (labels_sorted[0] == 0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert_true(labels_sorted[0] in [0, 1])

@agramfort
Copy link
Member

besides LGTM

assert_greater_equal(pred_sorted[0], -1)
# labels_ should be less than n_clusters - 1
# Labels are expected to start at 0 (no noise) or -1 (if noise)
assert_true(labels_sorted[0] in [0, 1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert_true(labels_sorted[0] in [0, -1])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, Travis does not fail

@albertcthomas
Copy link
Contributor Author

thanks @TomDLT ! I added noise to the dataset to make the test fail with [0, 1].

@amueller
Copy link
Member

lgtm

@amueller amueller changed the title [MRG] Labels of clustering should start at 0 or -1 if noise [MRG + 1] Labels of clustering should start at 0 or -1 if noise Oct 26, 2017
@TomDLT
Copy link
Member

TomDLT commented Oct 26, 2017

LGTM, waiting for appveyor before merging

@jnothman
Copy link
Member

Might deserve a what's new for any clusterer developers. Maybe we should consider a what's new section Judy for estimator checks

@agramfort agramfort merged commit 1f7fa76 into scikit-learn:master Oct 27, 2017
@agramfort
Copy link
Member

thx @albertcthomas

donigian added a commit to donigian/scikit-learn that referenced this pull request Oct 28, 2017
…cs/donigian-update-contribution-guidelines

* 'master' of github.com:scikit-learn/scikit-learn: (23 commits)
  fixes scikit-learn#10031: fix attribute name and shape in documentation (scikit-learn#10033)
  [MRG+1] add changelog entry for fixed and merged PR scikit-learn#10005 issue scikit-learn#9633 (scikit-learn#10025)
  [MRG] Fix LogisticRegression see also should include LogisticRegressionCV(scikit-learn#9995) (scikit-learn#10022)
  [MRG + 1] Labels of clustering should start at 0 or -1 if noise (scikit-learn#10015)
  MAINT Remove redundancy in scikit-learn#9552 (scikit-learn#9573)
  [MRG+1] correct comparison in GaussianNB for 'priors' (scikit-learn#10005)
  [MRG + 1] ENH add check_inverse in FunctionTransformer (scikit-learn#9399)
  [MRG] FIX bug in nested set_params usage (scikit-learn#9999)
  [MRG+1] Fix LOF and Isolation benchmarks (scikit-learn#9798)
  [MRG + 1] Fix negative inputs checking in mean_squared_log_error (scikit-learn#9968)
  DOC Fix typo (scikit-learn#9996)
  DOC Fix typo: x axis -> y axis (scikit-learn#9985)
  improve example plot_forest_iris.py (scikit-learn#9989)
  [MRG+1] Deprecate pooling_func unused parameter in AgglomerativeClustering (scikit-learn#9875)
  DOC update news
  DOC Fix three typos in manifold documentation (scikit-learn#9990)
  DOC add missing dot in docstring
  DOC Add what's new for 0.19.1 (scikit-learn#9983)
  Improve readability of outlier detection example. (scikit-learn#9973)
  DOC: Fixed typo (scikit-learn#9977)
  ...
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
…it-learn#10015)

* test labels of clustering should start at 0 or -1 if noise

* take into account agramfort's comment

* fix test
jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017
…it-learn#10015)

* test labels of clustering should start at 0 or -1 if noise

* take into account agramfort's comment

* fix test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants