-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
FIX LabelPropagation handling of sparce matrices #17085 #17384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @goerch . This also needs a regression test specific to the issue it fixes.
@@ -190,7 +190,7 @@ class labels. | |||
""" | |||
check_is_fitted(self) | |||
|
|||
X_2d = check_array(X, accept_sparse=['csc', 'csr', 'coo', 'dok', | |||
X_2d = check_array(X, accept_sparse=['csc', 'csr', 'coo', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this use _validate_data
? ping @NicolasHug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also really not sure why this fixes the issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this use _validate_data
We only use it in fit
for now, so check_array
is fine here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback @adrinjalali: PR #17085 already contains tests extended to the sparse case. Could you please advise a newbie regarding regression tests?
clf = label_propagation.LabelSpreading(max_iter=10000, alpha=alpha) | ||
clf.fit(X, y) | ||
assert_array_almost_equal(expected, clf.label_distributions_, 4) | ||
for sparse_or_dense in SPARSE_OR_DENSE: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you please parameterize the test for this one and alpha
? Also for the other tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adrinjalali: I didn't write the original tests, so I'm probably not the right one to ask for improving these with regard to parametrization.
I am taking over this issue in #19664 |
I am closing this pull request as superseded by #19664. |
Reference Issues/PRs
Fixes #17085
What does this implement/fix? Explain your changes.
Label propagation and spreading allow to classify using sparse data according to documentation. Tests only covered the dense case. Newly added coverage for sparse matrices allows to reproduce the problem in #17085. The proposed fix in #17085 works for the extended tests.
Any other comments?
test_label_propagation_closed_form
fails for sparse matrices