[MRG+1] Fix incorrect predict_proba for LogisticRegression in binary case using multinomial parameter. #9939


Merged: 9 commits into scikit-learn:master on Jan 6, 2018

Conversation

rwolst
Contributor

@rwolst rwolst commented Oct 17, 2017

Reference Issue

Fixes #9889

What does this implement/fix? Explain your changes.

Fixes incorrect predictions when fitting a LogisticRegression model on binary outcomes with multi_class='multinomial'.
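As an editor's sketch of the intended behaviour (my own illustration, not the actual patch): with a binary problem and `multi_class='multinomial'`, scikit-learn stores a single coefficient row, so the decision function is 1-D; `predict_proba` should apply the softmax to the stacked scores `[-d, d]` rather than treating the single column as a full multinomial score matrix. The helper name below is hypothetical.

```python
import numpy as np

def multinomial_binary_proba(decision):
    """Softmax probabilities for a 1-D binary decision function.

    Hypothetical helper illustrating the fix: stack [-d, d] and take
    the softmax, so P(class 1) = sigmoid(2 * d).
    """
    scores = np.c_[-decision, decision]          # shape (n_samples, 2)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(scores)
    return exp / exp.sum(axis=1, keepdims=True)

d = np.array([-2.0, 0.0, 3.0])
proba = multinomial_binary_proba(d)
# Each row sums to 1; the second column equals sigmoid(2 * d).
```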

Any other comments?

Member

@TomDLT TomDLT left a comment


LGTM, only nitpicks

- Fixed a bug in :class:`linear_model.LogisticRegression` where when using the
parameter ``multi_class='multinomial'``, the ``predict_proba`` method was
returning incorrect probabilities in the case of binary outcomes.
:issue:`9889`. By user `rwolst`.
Member

Please use the syntax:

:issue:`9889` by :user:`rwolst`.

Member

Please use this issue number, not the original issue. It makes it easier to check that the logs are complete.

Member

Is that from your 0.19.1 experience? I would say that, as a random user, the original issue (rather than the PR) is generally a better source of understanding what the problem was, so I would be mildly in favor of using the original issue in the whats_new entry. The PR is better to understand how the problem was fixed.

Member

No, for something like the 0.19.0 release I tried to check that the changelog included everything merged that wasn't docs, CI, etc. That is easier with PR numbers here. If GitHub had an API to get the PRs closing a particular issue, this would not be difficult...

@@ -1102,13 +1102,18 @@ class LogisticRegression(BaseEstimator, LinearClassifierMixin,
Coefficient of the features in the decision function.

`coef_` is of shape (1, n_features) when the given problem
- is binary.
+ is binary and in the case when `multi_class='multinomial'`, then
Member

This is a bit verbose, even though it is a matter of taste.
What about:

        `coef_` is of shape (1, n_features) when the given problem is binary.
        In particular, when `multi_class='multinomial'`, `coef_` corresponds to
        outcome 1 (True) and `-coef_`  corresponds to outcome 0 (False).


intercept_ : array, shape (1,) or (n_classes,)
Intercept (a.k.a. bias) added to the decision function.

If `fit_intercept` is set to False, the intercept is set to zero.
- `intercept_` is of shape(1,) when the problem is binary.
+ `intercept_` is of shape(1,) when the problem is binary and in the
Member
What about:

        `intercept_` is of shape(1,) when the problem is binary, and
        when `multi_class='multinomial'`, it corresponds to outcome 1 (True),
        while `-intercept` corresponds to outcome 0 (False).

@TomDLT TomDLT changed the title [MRG] Fix incorrect predict_proba for LogisticRegression in binary case using multinomial parameter. [MRG+1] Fix incorrect predict_proba for LogisticRegression in binary case using multinomial parameter. Oct 18, 2017
@rwolst
Contributor Author

rwolst commented Oct 26, 2017

@TomDLT I agree with your comments, I will update the pull request.

On top of the merge, added the changes suggested in pull request.
@lesteve
Member

lesteve commented Nov 8, 2017

Can you add a non-regression test please?

- Fixed a bug in :class:`linear_model.LogisticRegression` where when using the
parameter ``multi_class='multinomial'``, the ``predict_proba`` method was
returning incorrect probabilities in the case of binary outcomes.
:issue:`9939` by user `rwolst`.
Member

@TomDLT TomDLT Nov 8, 2017


Please use the following syntax (with the two colons :), which will render as a link to your GitHub profile:

:user:`rwolst`

@rwolst
Contributor Author

rwolst commented Nov 8, 2017

@lesteve Maybe I am mistaken about what a non-regression test is, but I think that is what test_ovr_multinomial_iris_binary() is, i.e. it fails on master but passes on the new branch.

@lesteve
Member

lesteve commented Nov 8, 2017

@lesteve Maybe I am mistaken about what a non-regression test is, but I think that is what test_ovr_multinomial_iris_binary() is, i.e. it fails on master but passes on the new branch.

Sorry I must have missed it in the diff somehow.

Member

@jnothman jnothman left a comment


Otherwise LGTM

- Fixed a bug in :class:`linear_model.LogisticRegression` where when using the
parameter ``multi_class='multinomial'``, the ``predict_proba`` method was
returning incorrect probabilities in the case of binary outcomes.
:issue:`9939` by :user: `rwolst`.
Member

No space between colon and backtick please. You're also welcome to include your real name

Member

@rwolst you can find an example of the syntax to use in the previous entry.

Contributor Author

Ok, will fix this in next commit.

Member

@lesteve lesteve left a comment


Some small comments

return super(LogisticRegression, self)._predict_proba_lr(X)
elif self.coef_.shape[0] == 1:
Member

@lesteve lesteve Nov 9, 2017


I find it slightly clearer to check the shape of the decision function, i.e. something like:

else:
    decision = self.decision_function(X)
    decision_2d = np.c_[-decision, decision] if decision.ndim == 1 else decision
    return softmax(decision_2d, copy=False)

@@ -565,6 +565,38 @@ def test_ovr_multinomial_iris():
assert_equal(scores.shape, (3, n_cv, 10))


def test_ovr_multinomial_iris_binary():
Member


This seems like a valid regression test but I feel something like this would be simpler:

import numpy as np

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification()
clf = LogisticRegression(multi_class='multinomial', solver='saga')
clf.fit(X, y)
decision = clf.decision_function(X)
proba = clf.predict_proba(X)
expected_proba_class_1 = (np.exp(decision) /
                          (np.exp(-decision) + np.exp(decision)))
expected_proba = np.c_[1 - expected_proba_class_1, expected_proba_class_1]
np.testing.assert_allclose(proba, expected_proba)

Basically you only check the relationship between decision_function and predict_proba rather than some consequences further down the line.

Member

do you want these addressed or should we merge given the two +1? (I haven't looked at it myself). @rwolst are you still working on this?

Contributor Author

I agree that @lesteve's regression test is simpler and will add it. Do you want to keep the old regression test, as it may still be useful?

@jnothman
Member

jnothman commented Jan 1, 2018 via email

@jnothman
Member

jnothman commented Jan 6, 2018

Thanks a lot, @rwolst. Merging!

@jnothman jnothman merged commit 4dafa52 into scikit-learn:master Jan 6, 2018
@jnothman
Member

As per #11476 (comment) and subsequent discussion, we are reconsidering this. We're no longer persuaded that this is the right fix, but rather that we should always use the binary logistic formulation when the input is binary. This would be consistent with the existing behaviour of liblinear (and more generally with multi_class='ovr'), and with the literal meaning of multi_class=* (i.e. that setting shouldn't affect the binary case). Please let us know ASAP, @rwolst, if you have a basis for disagreeing with that conclusion. Thanks.
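An editor's sketch of why the `multi_class` setting affects the binary case at all (my own illustration, not scikit-learn code): for the same decision value d, the plain binary logistic formulation gives sigmoid(d), while the symmetric two-class multinomial softmax over [-d, d] gives sigmoid(2d). The two parameterizations therefore push the optimizer toward coefficients of different scale, and with regularization the fitted models genuinely differ.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

d = np.linspace(-3.0, 3.0, 7)

# Binary logistic / ovr / liblinear form: P(class 1) = sigmoid(d).
p_binary = sigmoid(d)

# Symmetric multinomial form: softmax over [-d, d], which simplifies
# to sigmoid(2 * d), so the same d yields sharper probabilities.
p_multinomial = np.exp(d) / (np.exp(-d) + np.exp(d))
```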
