
[MRG] Fix MultinomialNB and BernoulliNB alpha=0 bug #7477


Closed
wants to merge 8 commits

Conversation

@yl565 (Contributor) commented Sep 23, 2016

PR to #5814



@yl565 yl565 changed the title Fix MultinomialNB and BernoulliNB alpha=0 bug [MRG] Fix MultinomialNB and BernoulliNB alpha=0 bug Sep 24, 2016
```python
th = 1e30
p = np.asarray(p)
if (p > 1).any() or (p < 0).any():
    raise ValueError('Input `p` must be within [0, 1] range!')
```
agramfort (Member):

If this happens for numerical reasons, the error does not tell the user how to make it work, other than by changing the data.

Why not clip systematically?

yl565 (Contributor, author):

@agramfort Thanks for your comment. This happens when some elements contain inf (or are close enough to overflow into inf): np.dot or np.ndarray.sum may then return nan. Some simple examples:

```python
In [3]: np.inf - np.inf
Out[3]: nan

In [7]: 1e+308 + 1e+308
Out[7]: inf

In [8]: (1e+308 + 1e+308) - (1e+308 + 1e+308)
Out[8]: nan
```

Since we are calculating log probabilities, I think it is reasonable to replace log(0) = -inf with a large negative number such as -1e30 when the probability is 0.

> why not clipping systematically?

I'm not sure I understand but do you mean creating wrappers for numpy to clip all values?
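The -1e30 floor described above could be sketched as follows; `safe_log` and `_LOG_FLOOR` are illustrative names, not code from this PR:

```python
import numpy as np

_LOG_FLOOR = -1e30  # stands in for log(0) = -inf; name is hypothetical

def safe_log(p):
    """Elementwise log that maps p == 0 to a large negative constant
    instead of -inf, so later sums and dot products stay nan-free."""
    p = np.asarray(p, dtype=float)
    out = np.full_like(p, _LOG_FLOOR)
    np.log(p, where=p > 0, out=out)  # only write where log is finite
    return out

print(safe_log([0.0, 0.5, 1.0]))
```

Because the floor is finite, expressions like `np.dot(X, safe_log(p))` can no longer produce `inf - inf = nan`.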

Member:

I think he means clipping p to be between 0 and 1
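What that suggestion might look like (a minimal sketch, not the PR's code):

```python
import numpy as np

# Probabilities that drifted slightly outside [0, 1] due to rounding.
p = np.array([-1e-12, 0.3, 1.0 + 1e-12])

# Clip systematically instead of raising a ValueError.
p = np.clip(p, 0.0, 1.0)
print(p)  # every entry now lies in [0, 1]
```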

@amueller (Member):
something went wrong with your rebase...

@yl565 (Contributor, author) commented Oct 24, 2016

@amueller I fixed the rebase problem. I have also adopted a simple solution: clip alpha so it is never smaller than _ALPHA_MIN = 1e-10. This avoids the log(prob) = -inf problem when prob = 0 and seems to work well. Alternatively, maybe we could treat alpha=0 as a special case and use a different numerical solution.
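Why the alpha floor avoids the -inf: with alpha = 0, a feature that never occurs in a class has a smoothed count of 0 and log(0) = -inf. A small sketch of the idea (`feature_log_prob` is an illustrative name, not the PR's implementation):

```python
import numpy as np

_ALPHA_MIN = 1e-10  # the floor mentioned in the comment above

def feature_log_prob(counts, alpha):
    """Smoothed per-feature log probabilities, multinomial-NB style."""
    smoothed = counts + alpha
    return np.log(smoothed) - np.log(smoothed.sum())

counts = np.array([3.0, 0.0, 5.0])  # one feature never observed

with np.errstate(divide="ignore"):
    print(feature_log_prob(counts, 0.0))    # contains -inf
print(feature_log_prob(counts, _ALPHA_MIN))  # all finite
```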

```diff
@@ -680,10 +684,21 @@ class MultinomialNB(BaseDiscreteNB):
     """

     def __init__(self, alpha=1.0, fit_prior=True, class_prior=None):
         self.alpha = alpha
+        self._alpha = alpha
```
Member:

This violates the scikit-learn API. Please handle the setting of alpha in fit. Also, only clip it when it is used, and don't change the value of self.alpha.
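The convention the reviewer is pointing to: `__init__` stores its arguments verbatim, and any clipping happens only where the value is used. A hypothetical sketch (`SketchNB` and `_effective_alpha` are illustrative names, not scikit-learn code):

```python
_ALPHA_MIN = 1e-10

class SketchNB:
    def __init__(self, alpha=1.0):
        # scikit-learn API: store the constructor argument unchanged,
        # so get_params / set_params round-trip exactly.
        self.alpha = alpha

    def _effective_alpha(self):
        # Clip only at the point of use; self.alpha is never mutated.
        return max(self.alpha, _ALPHA_MIN)

est = SketchNB(alpha=0.0)
print(est.alpha)               # unchanged public parameter
print(est._effective_alpha())  # clipped value actually used in fit
```

Keeping the public attribute untouched means cloning and grid search see exactly the alpha the user passed in.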

@dalmia (Contributor) commented Feb 18, 2017

@yl565 Are you still working on this?

@yl565 (Contributor, author) commented Feb 18, 2017

Go ahead if you want to work on it

@jnothman (Member):
I think we need another contributor, or can you make the minor fixes and add a what's new entry, @yl565?

@jmschrei (Member):
Fixed via #9131

@jmschrei jmschrei closed this Jun 19, 2017
Labels: Easy (Well-defined and straightforward way to resolve), Waiting for Reviewer
Projects: None yet
6 participants