Label spreading bug fix #15946
Conversation
Thanks for the pull request. Please add a test.
Test added. :)
@@ -289,6 +289,8 @@ def fit(self, X, y):
             )
             self.n_iter_ += 1

+        l_bool_zeros = np.sum(self.label_distributions_, axis=1) == 0
+        self.label_distributions_[l_bool_zeros, :] = 1
Why are you modifying label_distributions_ and not just normalizer?
Because normalizer is computed from self.label_distributions_: when the latter is all zeros, the former is zero as well, and the code then performs 0/0. Maybe it is better to modify only normalizer, setting it to one when self.label_distributions_ is all zeros?
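The 0/0 situation described above can be reproduced in a few lines of numpy. The matrix below is illustrative, not data from the PR; its last row is all zeros, as can happen for a sample that receives no label mass:

```python
import numpy as np

# Hypothetical 3-sample, 2-class label distribution; the last row is
# all zeros, so its row sum (the normalizer) is zero too.
label_distributions = np.array([[0.7, 0.3],
                                [0.2, 0.8],
                                [0.0, 0.0]])

normalizer = np.sum(label_distributions, axis=1)[:, np.newaxis]

# Dividing by the zero normalizer produces 0/0 -> nan for the last row.
with np.errstate(invalid="ignore"):
    unsafe = label_distributions / normalizer

# The fix discussed here: patch only the normalizer, leaving the
# distributions themselves untouched.
normalizer[normalizer == 0] = 1
safe = label_distributions / normalizer
```

With the patched normalizer, the all-zero row stays all zeros instead of becoming nan, and the other rows are normalized as before.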
Yes, that is what I would suggest.
The check codecov/patch has failed. How can I fix it?
@@ -290,6 +290,7 @@ def fit(self, X, y):
             self.n_iter_ += 1

         normalizer = np.sum(self.label_distributions_, axis=1)[:, np.newaxis]
+        normalizer = np.array([[1] if x[0] == 0 else x for x in normalizer])
I think there are more idiomatic numpy ways to do this. Indeed, using np.clip might be sufficient (the replacement does not need to be 1; it can be any non-zero number). But normalizer[normalizer == 0] = 1 might be sufficient.
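Both alternatives mentioned in the review can be sketched side by side. The input array is illustrative; as the reviewer notes, the replacement value only needs to be non-zero, because the corresponding distribution rows are all zeros anyway:

```python
import numpy as np

# Illustrative column vector of row sums; one entry is zero.
normalizer = np.array([[1.5], [0.0], [2.0]])

# Option 1: boolean-mask assignment, as suggested in the review.
masked = normalizer.copy()
masked[masked == 0] = 1

# Option 2: np.clip with any positive floor; the exact floor value is
# irrelevant since 0 / floor is still 0 for the affected rows.
clipped = np.clip(normalizer, 1e-12, None)
```

Both leave the non-zero entries untouched and only replace the zero entry; the mask-assignment version is what the PR ended up using.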
You're right, it was sufficient to use normalizer[normalizer == 0] = 1.
I'm not sure if we need a changelog entry if we are merely hiding a warning... wdyt?
Yes, it only hides a warning; without it, the model would return nan on some predictions. I clicked on the review request by mistake, just ignore it.
LGTM. I added the link to the PR.
Thanks @ngshya
It can happen that the variable normalizer contains zeros, in which case the division below fails. The line of code added here avoids the division by zero and preserves the logic of the model.