Label spreading bug fix #15946
Conversation
Thanks for the pull request. Please add a test.
Test added. :)
@@ -289,6 +289,8 @@ def fit(self, X, y):
             )
             self.n_iter_ += 1

+        l_bool_zeros = np.sum(self.label_distributions_, axis=1) == 0
+        self.label_distributions_[l_bool_zeros, :] = 1
Why are you modifying label_distributions_ and not just normalizer?
Because normalizer is computed from self.label_distributions_: when the latter is all zeros, the former is zero as well, and the code then performs 0/0. Maybe it is better to modify only normalizer, setting it to one when self.label_distributions_ is all zeros?
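The 0/0 situation described above can be reproduced in a few lines of numpy. The matrix below is illustrative, not data from the PR; its last row is all zeros, as can happen for a sample that receives no label mass:

```python
import numpy as np

# Hypothetical 3-sample, 2-class label distribution; the last row is
# all zeros, so its row sum (the normalizer) is zero too.
label_distributions = np.array([[0.7, 0.3],
                                [0.2, 0.8],
                                [0.0, 0.0]])

normalizer = np.sum(label_distributions, axis=1)[:, np.newaxis]

# Dividing by the zero normalizer produces 0/0 -> nan for the last row.
with np.errstate(invalid="ignore"):
    unsafe = label_distributions / normalizer

# The fix discussed here: patch only the normalizer, leaving the
# distributions themselves untouched.
normalizer[normalizer == 0] = 1
safe = label_distributions / normalizer
```

With the patched normalizer, the all-zero row stays all zeros instead of becoming nan, and the other rows are normalized as before.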
Yes, that is what I would suggest.
The check codecov/patch has failed. How can I fix it?
@@ -290,6 +290,7 @@ def fit(self, X, y):
             self.n_iter_ += 1

         normalizer = np.sum(self.label_distributions_, axis=1)[:, np.newaxis]
+        normalizer = np.array([[1] if x[0] == 0 else x for x in normalizer])
I think there are more idiomatic numpy ways to do this. Indeed, using np.clip might be sufficient (the replacement does not need to be 1; it can be any non-zero number). But normalizer[normalizer == 0] = 1 might be sufficient.
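Both alternatives mentioned in the review can be sketched side by side. The input array is illustrative; as the reviewer notes, the replacement value only needs to be non-zero, because the corresponding distribution rows are all zeros anyway:

```python
import numpy as np

# Illustrative column vector of row sums; one entry is zero.
normalizer = np.array([[1.5], [0.0], [2.0]])

# Option 1: boolean-mask assignment, as suggested in the review.
masked = normalizer.copy()
masked[masked == 0] = 1

# Option 2: np.clip with any positive floor; the exact floor value is
# irrelevant since 0 / floor is still 0 for the affected rows.
clipped = np.clip(normalizer, 1e-12, None)
```

Both leave the non-zero entries untouched and only replace the zero entry; the mask-assignment version is what the PR ended up using.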
You're right, it was sufficient to use normalizer[normalizer == 0] = 1.
I'm not sure if we need a changelog entry if we are merely hiding a warning... wdyt?
Yes, it only hides a warning; without it, the model would return nan on some predictions. I clicked on the review request by mistake, just ignore it.
LGTM. I added the link to the PR.
Thanks @ngshya
It can happen that the variable normalizer contains zeros, in which case the division below fails. The line of code added here avoids the division by zero and preserves the logic of the model.