Skip to content

Label spreading bug fix #15946

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Jan 9, 2020
Merged

Conversation

ngshya
Copy link
Contributor

@ngshya ngshya commented Dec 21, 2019

It can happen that the variable "normalizer" contains zeros and therefore the division below can fail. The line of the code added here avoids the division by zero and preserves the logic of the model.

@jnothman
Copy link
Member

Thanks for the pull request. Please add a test

@ngshya
Copy link
Contributor Author

ngshya commented Dec 22, 2019

Thanks for the pull request. Please add a test

Test added. :)

@@ -289,6 +289,8 @@ def fit(self, X, y):
)
self.n_iter_ += 1

l_bool_zeros = np.sum(self.label_distributions_, axis=1) == 0
self.label_distributions_[l_bool_zeros, :] = 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you modifying label_distributions_ and not just normalizer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because normalizer is computed from self.label_distributions_, when the latter is all zeros, also the former is zero and then it tries to performs 0/0. Maybe it is better to modify only normalizer to set it to one when self.label_distributions_ is all zeros?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is what I would suggest.

@ngshya
Copy link
Contributor Author

ngshya commented Jan 8, 2020

The check codecov/patch has failed. How can I fix it?

@@ -290,6 +290,7 @@ def fit(self, X, y):
self.n_iter_ += 1

normalizer = np.sum(self.label_distributions_, axis=1)[:, np.newaxis]
normalizer = np.array([[1] if x[0] == 0 else x for x in normalizer])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are more idiomatic numpy ways to do this. Indeed, using np.clip might be sufficient (the replacement does not need to be 1, can be any non zero number).

But normaliser[normaliser == 0] = 1] might be sufficient

Copy link
Contributor Author

@ngshya ngshya Jan 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right it was sufficient to use normalizer[normalizer == 0] = 1

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we need a change log entry if we are merely hiding a warning..wdyt?

@glemaitre glemaitre self-requested a review January 9, 2020 14:34
@ngshya ngshya requested a review from jnothman January 9, 2020 15:05
@ngshya
Copy link
Contributor Author

ngshya commented Jan 9, 2020

I'm not sure if we need a change log entry if we are merely hiding a warning..wdyt?

Yes, it is only hiding a warning, otherwise it would return nan on some predictions.

Clicked on review request by mistake, just ignore it.

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I added the link to the PR.

@glemaitre glemaitre merged commit 50d3fe9 into scikit-learn:master Jan 9, 2020
@glemaitre
Copy link
Member

Thanks @ngshya

@glemaitre glemaitre removed the request for review from jnothman January 9, 2020 18:12
@ngshya ngshya deleted the label_spreading_bug_fix branch January 9, 2020 18:59
thomasjpfan pushed a commit to thomasjpfan/scikit-learn that referenced this pull request Feb 22, 2020
panpiort8 pushed a commit to panpiort8/scikit-learn that referenced this pull request Mar 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants