[MRG] Normalize linear_model decision_function scores. #19142

stootoon · 2021-01-10T12:11:31Z

Reference Issues/PRs

Fixes #19139.

What does this implement/fix? Explain your changes.

According to the docs, the decision_function scores in LinearClassifierMixin are supposed to be the distances of samples to the corresponding hyperplanes. I assume the hyperplanes are defined by the coefficient and intercept arrays. The scores are computed as the scalar product of each sample with each coefficient vector, plus the corresponding intercept. The problem is that unless these scores are normalized by the norm of the corresponding coefficient vector, they are not actually the signed distances to the hyperplanes: the signed distance of a point p to the hyperplane defined by c'x + b = 0 is (c'p + b)/|c|, not the (c'p + b) currently computed.

I fix this problem by normalizing the scores by the norm of their corresponding coefficients.
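To illustrate the difference, here is a minimal sketch in plain Python. The helper names `raw_score` and `signed_distance` are made up for this example and are not part of sklearn. It shows that rescaling c and b leaves the hyperplane and the signed distance unchanged but changes the raw score.

```python
import math

def raw_score(c, b, p):
    # Unnormalized score c'p + b.
    return sum(ci * pi for ci, pi in zip(c, p)) + b

def signed_distance(c, b, p):
    # Signed distance (c'p + b) / |c| of point p to the hyperplane c'x + b = 0.
    norm = math.sqrt(sum(ci * ci for ci in c))
    return raw_score(c, b, p) / norm

p = [3.0, 4.0]
c, b = [3.0, 4.0], -5.0
c2, b2 = [6.0, 8.0], -10.0  # same hyperplane, coefficients scaled by 2

print(raw_score(c, b, p), raw_score(c2, b2, p))              # 20.0 40.0
print(signed_distance(c, b, p), signed_distance(c2, b2, p))  # 4.0 4.0
```

The raw score doubles when the coefficients are doubled, while the normalized value stays at 4.0.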

Any other comments?

I ran pytest on linear_model and a few of the tests are failing, some because computed accuracies are being compared to hard-coded values. This may be expected if the hard-coded values reflect desired outputs using the previous, potentially incorrect, method of computing the scores.

Also, I haven't added any check for division by zero, which would occur if any row of coefficients is all zeros, because I wasn't sure what the sklearn best practice is for this; it should be easy for someone who knows to add it. Some tests are failing because NaNs are appearing, presumably due to such division by zero.
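One possible way to guard the division is sketched below with NumPy. This is only an illustration under my own assumptions, not what sklearn does: `normalized_scores` is a hypothetical helper, and leaving the raw score untouched for an all-zero coefficient row is an arbitrary choice.

```python
import numpy as np

def normalized_scores(X, coef, intercept):
    # Hypothetical helper, not sklearn API.
    # Raw scores, shape (n_samples, n_hyperplanes).
    scores = X @ coef.T + intercept
    # One norm per hyperplane (row of coef).
    coef_norms = np.linalg.norm(coef, axis=1)
    # Assumption: an all-zero row defines no hyperplane, so divide by 1
    # and keep the raw score instead of producing NaN or inf.
    safe_norms = np.where(coef_norms == 0, 1.0, coef_norms)
    return scores / safe_norms

X = np.array([[3.0, 4.0], [0.0, 0.0]])
coef = np.array([[3.0, 4.0], [0.0, 0.0]])  # second row is all zeros
intercept = np.array([-5.0, 2.0])

result = normalized_scores(X, coef, intercept)
print(result)
```

With these inputs the first column is divided by the norm 5, and the second column keeps its raw values, so no NaNs appear.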

cmarmo · 2021-01-11T10:01:36Z

Thanks @stootoon! Some lint errors need to be fixed.

Running flake8 on the diff in the range 1e46db669..21be8a215 (3 commit(s)):
--------------------------------------------------------------------------------
sklearn/linear_model/_base.py:289:1: W293 blank line contains whitespace
        
^
sklearn/linear_model/_base.py:290:56: W291 trailing whitespace
        coef_norms = np.linalg.norm(self.coef_, axis=1)        
                                                       ^
sklearn/linear_model/_base.py:291:15: E221 multiple spaces before operator
        scores    /= coef_norms
              ^
sklearn/linear_model/_base.py:292:1: W293 blank line contains whitespace
        
^

Exited with code exit status 1

CircleCI received exit code 1

NicolasHug · 2021-01-11T10:35:25Z

Thanks for the PR @stootoon, but we can't change the output of decision_function (see #19139 (comment)). Instead, it would be better to just change the docstring, as suggested in the original issue.

NicolasHug

Thanks!

stootoon added 2 commits January 10, 2021 10:46

Use unit-norm coefs to compute scores.

7f2dd1b

Intercepts must be normalized as well.

5423c45

github-actions bot added the module:linear_model label Jan 10, 2021

stootoon added 3 commits January 11, 2021 10:14

Removed whitespace causing linting issues.

15b3be1

Removed more whitespace.

d7dee06

Merge remote-tracking branch 'upstream/master' into linear_model_deci…

66afe00

…sion_function

Updated docstring to correct relationship of decision_function to hyp…

9b613ed

…erplanes.

agramfort approved these changes Jan 12, 2021

NicolasHug approved these changes Jan 13, 2021

NicolasHug merged commit 9f86a25 into scikit-learn:master Jan 13, 2021

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Jan 18, 2021

DOC Normalization of linear_model decision_function (scikit-learn#19142)

7baf758

jeremiedbb pushed a commit that referenced this pull request Jan 19, 2021

DOC Normalization of linear_model decision_function (#19142)

b188f94
