Deprecations 0.21 sgd tol #12239

amueller · 2018-10-01T21:49:19Z

Working on #11992: completing the deprecation cycle that replaces n_iter with max_iter/tol in the SGD family.

sklearn/linear_model/passive_aggressive.py

sklearn/linear_model/stochastic_gradient.py

amueller · 2018-10-01T21:56:54Z

sklearn/linear_model/stochastic_gradient.py

@@ -1556,12 +1499,11 @@ class SGDRegressor(BaseSGDRegressor):

    """
    def __init__(self, loss="squared_loss", penalty="l2", alpha=0.0001,
-                 l1_ratio=0.15, fit_intercept=True, max_iter=None, tol=None,
+                 l1_ratio=0.15, fit_intercept=True, max_iter=None, tol=1e-3,


missing max_iter here.

sklearn-lgtm · 2018-10-02T17:06:56Z

This pull request introduces 1 alert when merging b4fc3c3 into dfd009d - view on LGTM.com

new alerts:

1 for Unused import

Comment posted by LGTM.com

amueller · 2018-10-02T18:59:54Z

So it looks like setting tol=0 does not actually preserve the previous behavior; setting tol=-inf would. Setting tol to any finite negative number is currently supported but seems very counter-intuitive. Should we disallow that (maybe worth a separate issue)?
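To illustrate why tol=0 and tol=-inf differ, here is a minimal sketch of a tolerance-based stopping check. It assumes a criterion of the form `loss > best_loss - tol`; the function name `epochs_run` and the loss values are made up for illustration, and this is not scikit-learn's actual training loop.

```python
import math


def epochs_run(losses, tol):
    """Return how many epochs run before `loss > best_loss - tol` fires.

    Hypothetical helper: `losses` stands in for the per-epoch training loss.
    """
    best_loss = math.inf
    for epoch, loss in enumerate(losses, start=1):
        # Stop as soon as the loss fails to beat the best loss by `tol`.
        if loss > best_loss - tol:
            return epoch
        best_loss = min(best_loss, loss)
    return len(losses)


losses = [1.0, 0.5, 0.6, 0.4]       # the loss goes up once, at epoch 3
print(epochs_run(losses, 0.0))        # stops at the first increase
print(epochs_run(losses, -math.inf))  # the check can never fire
print(epochs_run(losses, -0.2))       # tolerates increases of up to 0.2
```

Under this assumed criterion, tol=0 still stops early whenever the loss goes up, only tol=-inf disables the check entirely, and a finite negative tol quietly permits the loss to get worse by that amount.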

amueller · 2018-10-02T19:04:24Z

Also, if a user passed tol=None for some reason, they got no warning in the past, but now it breaks... hm...

amueller · 2018-10-02T19:06:06Z

sklearn/linear_model/stochastic_gradient.py

@@ -1290,8 +1242,6 @@ def _fit_regressor(self, X, y, alpha, C, loss, learning_rate,
        # Windows
        seed = random_state.randint(0, np.iinfo(np.int32).max)

-        tol = self._tol if self._tol is not None else -np.inf


maybe this line should stay here, making None an alias for -np.inf?

TomDLT · 2018-10-03

I agree.
In any case, we have to be consistent, either keeping this line, or removing the one in fit_binary:

scikit-learn/sklearn/linear_model/stochastic_gradient.py

Line 389 in 92a50e5

tol = est.tol if est.tol is not None else -np.inf
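One way to keep the two code paths consistent would be to route both through a single helper. The sketch below assumes a hypothetical function `resolve_tol` (not part of scikit-learn) and uses `math.inf` from the standard library, which compares equal to `np.inf`.

```python
import math


def resolve_tol(tol):
    """Map tol=None to -inf so that None means "no early stopping".

    Hypothetical helper mirroring the one-liner quoted above.
    """
    return -math.inf if tol is None else tol


print(resolve_tol(None))  # None becomes negative infinity
print(resolve_tol(1e-3))  # explicit values pass through unchanged
```

If both `_fit_regressor` and `fit_binary` called something like this, None would stay an alias for -np.inf in one place only.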

amueller · 2018-10-02T19:19:53Z

sklearn/linear_model/passive_aggressive.py

                  validation_fraction=0.1, verbose=0, warm_start=False)
    >>> print(clf.coef_)
-    [[0.29509834 0.33711843 0.56127352 0.60105546]]
+    [[-0.6543424   1.54603022  1.35361642  0.22199435]]


This is a really bad sign! I think we might have messed this up somehow.
I thought setting max_iter here meant we automatically set tol to 0.001 but apparently not?

amueller · 2018-10-02T19:22:48Z

Ok, so if someone set max_iter but not tol in 0.20, they got tol=None, which is equivalent to tol=-np.inf, and they didn't get a deprecation warning. In 0.21 they will get changed behavior: tol=0.001. Thoughts?

jnothman · 2018-10-03T00:56:41Z

tol=None was only valid during deprecation, so it's not a problem

TomDLT · 2018-10-03T09:50:28Z

But a user getting a DeprecationWarning for n_iter could silence it by setting max_iter instead. This would not trigger any warning about not setting tol, and was equivalent to tol=-np.inf. Now, changing the default to tol=0.001 would break their code.

The wise decision would probably be to remove warnings about n_iter and max_iter, but to do a new deprecation cycle for tol.

amueller · 2018-10-03T17:09:50Z

@jnothman can you elaborate? I think I'm with @TomDLT on this one, though that would be a real pain :-/

jnothman · 2018-10-04T08:04:59Z

I now see what you mean. I had skimmed. We had told them it was doing something different to what it actually was if they set max_iter and not tol. But we had documented that from 0.21 the default would be 1e-3. We had not clearly specified what would happen if max_iter was specified but tol not.

So I see a choice of:

- conservatively, follow backwards compatibility: adopt @TomDLT's approach, which would issue a FutureWarning when max_iter is explicitly set and tol is not.
- progressively, follow the documentation: follow what we had documented, but perhaps issue a ChangedBehaviorWarning if max_iter is set and tol is not.

(Is there a way to systematically avoid this in the future, I wonder?)
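The warning condition shared by both options (max_iter set, tol left unset) could look roughly like the sketch below. The function name `tol_with_future_warning` and the message text are assumptions made for illustration; this is not the code that was eventually merged.

```python
import warnings


def tol_with_future_warning(max_iter=None, tol=None):
    """Return the tol used today, warning when its default is about to change.

    Hypothetical sketch: None means "not set by the user".
    """
    if max_iter is not None and tol is None:
        warnings.warn(
            "max_iter was set but tol was left unset. tol currently behaves "
            "like -inf (no early stopping) and will default to 1e-3 in a "
            "future release. Set tol explicitly to silence this warning.",
            FutureWarning,
        )
    # Keep the current behaviour until the deprecation cycle completes.
    return float("-inf") if tol is None else tol


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    tol_with_future_warning(max_iter=100)
print([w.category.__name__ for w in caught])
```

Users who pass tol explicitly are unaffected, so the warning reaches only the group whose results would otherwise change silently.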

amueller · 2018-10-04T14:54:02Z

A similar issue arises in #12240 btw.

I have no idea how to systematically test for this. The test we want is to ensure that people who got no warning will not see changed behavior. But the new behavior is sometimes not even implemented when we raise the warning.

jnothman · 2018-10-05T04:08:07Z

Well it would require implementing the new behaviour explicitly when deprecating, and spoofing the default values at completion of deprecation.
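If the new behaviour were implemented alongside the old one, the check discussed above could be sketched as follows. `old_resolve` and `new_resolve` are simplified, hypothetical models of the deprecation-period and post-deprecation defaults (they are not scikit-learn functions), and the grid values are arbitrary.

```python
import itertools
import math
import warnings


def old_resolve(max_iter=None, tol=None):
    """Hypothetical deprecation-period defaults: warn only if nothing is set."""
    if max_iter is None and tol is None:
        warnings.warn("max_iter and tol defaults will change", FutureWarning)
        return 5, -math.inf
    return (1000 if max_iter is None else max_iter,
            -math.inf if tol is None else tol)


def new_resolve(max_iter=1000, tol=1e-3):
    """Hypothetical post-deprecation defaults."""
    return max_iter, tol


def silent_behaviour_changes(old, new, grid):
    """List argument sets where `old` gave no warning yet `new` differs."""
    changes = []
    for kwargs in grid:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            before = old(**kwargs)
        if not caught and before != new(**kwargs):
            changes.append(kwargs)
    return changes


# Every combination of "set" and "not set" for the two parameters.
grid = [
    {k: v for k, v in (("max_iter", m), ("tol", t)) if v is not None}
    for m, t in itertools.product((None, 10), (None, 1e-4))
]
print(silent_behaviour_changes(old_resolve, new_resolve, grid))
# → [{'max_iter': 10}]
```

In this toy model the only unwarned change is the case raised in this thread: max_iter set, tol not set.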

amueller · 2018-10-05T15:24:58Z

Yeah. I'm not sure how easy that is in general. We can try?

jnothman · 2018-10-06T23:28:16Z

Probably hard in general... Maybe there's an elegant way to do it? Just something to consider.

amueller · 2018-10-10T19:51:44Z

Should we get some sort of warning into 0.20.1? I guess it depends on which way we go?

jnothman · 2018-10-11T07:05:22Z

Sure, a FutureWarning in 0.20.1 wouldn't hurt...

amueller · 2018-10-16T16:00:10Z

We still haven't made a decision here, right? And we also need to make a decision for #12240.
I don't really have a strong opinion. @rth @ogrisel ?

jnothman · 2018-10-17T02:47:11Z

I'm happy with the progressive option: follow what we had documented, but perhaps issue a ChangedBehaviorWarning if max_iter is set and tol is not.

I am okay to issue a FutureWarning in 0.20.1

amueller · 2018-10-17T02:55:37Z

ok. @NicolasHug can you maybe do that?

amueller · 2018-12-14T15:50:22Z

closing as we need to wait longer now

amueller added 2 commits October 1, 2018 17:41

SGD n_iter/max_iter/tol deprecation

9a1baed

more n_iter/tol deprecations

5a0e033

amueller changed the title from "Deprecations 21 sgd tol" to "Deprecations 0.21 sgd tol" Oct 2, 2018

more tol/max_iter fixes

b4fc3c3

more SGD fixes

37bf3ec

amueller mentioned this pull request Oct 2, 2018

MNT simple deprecations and removals for 0.21 #12238

Merged

use tol=-np.inf in tests?

c644214

amueller added 2 commits October 2, 2018 15:06

replace none by -np.inf

6839d80

fixing some doctests. This seems fishy AF

92a50e5

amueller added this to the 0.20.1 milestone Oct 11, 2018

NicolasHug mentioned this pull request Oct 17, 2018

[MRG] Added FutureWarning in sgd models for tol parameter #12399

Merged

amueller modified the milestones: 0.20.1, 0.21 Oct 24, 2018

amueller closed this Dec 14, 2018
