
[MRG] Adds FutureWarning changing default solver to 'lbfgs' and multi_class to 'multinomial' in LogisticRegression #10001


Closed

Conversation

thechargedneutron
Contributor

Reference Issues/PRs

Fixes #9997

What does this implement/fix? Explain your changes.

Adds FutureWarning that default of solver and multi_class will be changed to 'lbfgs' and 'multinomial' respectively.

Any other comments?

@thechargedneutron
Contributor Author

@jnothman Is this PR in the right direction? I am not sure if this is what you meant by using 'auto'.

@thechargedneutron thechargedneutron changed the title Adds FutureWarning changing default solver to 'lbfgs' and multi_class to 'multinomial' in LogisticRegression [WIP] Adds FutureWarning changing default solver to 'lbfgs' and multi_class to 'multinomial' in LogisticRegression Oct 31, 2017
@@ -1198,6 +1198,14 @@ def fit(self, X, y, sample_weight=None):
self : object
Returns self.
"""
if self.solver == 'auto':
self.solver = 'liblinear'
warnings.warn("Default solver will be changed to 'lbfgs' in 0.22",
Member

Only for L2 penalty

Member

Maybe instead of 'default', say 'auto'.

But you also now give the user no easy way to silence the warning. We will need a temporary setting like 'default' to manage the warning.
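The "temporary setting" idea can be sketched roughly like this (the class name, the 'warn' sentinel, and the chosen solver strings are illustrative assumptions, not the PR's actual code): a sentinel default lets fit() tell "user left the default" apart from "user explicitly chose a solver", so passing any real solver name silences the warning.

```python
import warnings

# Rough sketch of the sentinel-default deprecation pattern being discussed;
# the class and the resolution rule are placeholders, not scikit-learn code.
class SketchLogisticRegression:
    def __init__(self, solver='warn'):
        self.solver = solver  # 'warn' marks "user did not choose a solver"

    def fit(self):
        solver = self.solver
        if solver == 'warn':
            solver = 'liblinear'  # keep the old behaviour during deprecation
            warnings.warn("Default solver will be changed to 'lbfgs' in 0.22. "
                          "Specify a solver to silence this warning.",
                          FutureWarning)
        return solver
```

Anyone who passes solver='liblinear' explicitly keeps the old behaviour and never sees the warning.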


FutureWarning)
if self.multi_class == 'auto':
self.multi_class = 'ovr'
warnings.warn("Default multi_class will be changed to "
Member

Only when solver is not liblinear

Contributor Author

But when solver is indeed liblinear, multi_class remains 'auto' and thus raises an error. Am I missing something here?

@thechargedneutron
Contributor Author

@jnothman Really sorry, I forgot that I had an open PR here, all because of my semester examinations. I somehow feel that GitHub's search over open PRs does not give the full list :P

@jnothman
Member

jnothman commented Dec 5, 2017

https://github.com/scikit-learn/scikit-learn/pulls/thechargedneutron should help.

You're getting an error in CI:

/home/travis/build/scikit-learn/scikit-learn/sklearn/linear_model/tests/test_logistic.py:245: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/home/travis/build/scikit-learn/scikit-learn/sklearn/linear_model/logistic.py:1231: in fit
    self.dual)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
solver = 'liblinear', multi_class = 'auto', penalty = 'l2', dual = False
    def _check_solver_option(solver, multi_class, penalty, dual):
        if solver not in ['liblinear', 'newton-cg', 'lbfgs', 'sag', 'saga']:
            raise ValueError("Logistic Regression supports only liblinear, "
                             "newton-cg, lbfgs, sag and saga solvers, got %s"
                             % solver)
    
        if multi_class not in ['multinomial', 'ovr']:
            raise ValueError("multi_class should be either multinomial or "
>                            "ovr, got %s" % multi_class)
E           ValueError: multi_class should be either multinomial or ovr, got auto

@thechargedneutron
Contributor Author

@jnothman Do you think we need to convert 'auto' into the concrete solver inside fit? _check_solver_option fails otherwise. So do I need to copy-paste the same logic there as well, or is there a better way to do it?

@jnothman (Member) left a comment

Again, if you wrote tests, or at least documented the correct behaviour first, with the premise that 'auto' must always work, you would have an easier time getting the logic right, and a green tick to say you had done so.

@@ -1198,6 +1202,14 @@ def fit(self, X, y, sample_weight=None):
self : object
Returns self.
"""
if self.solver == 'auto' and self.penalty == 'l2':
self.solver = 'liblinear'
@jnothman (Member) commented Dec 6, 2017

We don't allow fit to change constructor parameters. You can store the validated parameter as _solver if it should be private, or solver_ if it should be public (assumed useful to users).
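The convention jnothman refers to can be sketched as follows (hypothetical class; the penalty-to-solver mapping is just a placeholder): the constructor argument is stored untouched, and fit() writes its resolved value to a separate attribute.

```python
# Hypothetical sketch of the scikit-learn convention: fit() never overwrites
# self.solver; the resolved choice goes into self._solver (private) or
# self.solver_ (public, trailing underscore) instead.
class SketchEstimator:
    def __init__(self, solver='auto', penalty='l2'):
        self.solver = solver      # constructor params stay as the user set them
        self.penalty = penalty

    def fit(self):
        if self.solver == 'auto':
            # placeholder resolution rule, not the PR's actual logic
            self._solver = 'liblinear' if self.penalty == 'l2' else 'saga'
        else:
            self._solver = self.solver
        return self
```

This keeps get_params/set_params and cloning well-behaved, since constructor parameters round-trip unchanged.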

@thechargedneutron
Contributor Author

@jnothman I guess this is a better implementation and indeed passes tests (except probably a doctest failure).

@thechargedneutron
Contributor Author

Pulled the latest version to resolve conflicts.

@thechargedneutron thechargedneutron changed the title [WIP] Adds FutureWarning changing default solver to 'lbfgs' and multi_class to 'multinomial' in LogisticRegression [MRG] Adds FutureWarning changing default solver to 'lbfgs' and multi_class to 'multinomial' in LogisticRegression Jan 8, 2018
@TomDLT
Member

TomDLT commented Jan 25, 2018

test_logistic_regression_warnings and test_logistic_regression_cv_warnings could also be merged into one test to avoid duplication, using a loop or with pytest:

@pytest.mark.parametrize('model', [LogisticRegression, LogisticRegressionCV])
def test_logistic_regression_warnings(model):
    ...

Same for test_logistic_regression_auto and test_logistic_regression_cv_auto
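Fleshed out with stand-in classes (the real test bodies aren't shown in this thread), the merged parametrized test might look like this — DummyLR and DummyLRCV are placeholders for LogisticRegression and LogisticRegressionCV, and the warning they raise is an assumption about what the duplicated tests were checking:

```python
import warnings

import pytest

# Stand-ins so the parametrization pattern itself is runnable here.
class DummyLR:
    def fit(self):
        warnings.warn("default solver will change", FutureWarning)
        return self

class DummyLRCV(DummyLR):
    pass

@pytest.mark.parametrize('model', [DummyLR, DummyLRCV])
def test_logistic_regression_warnings(model):
    # one body covers both estimators instead of two copy-pasted tests
    with pytest.warns(FutureWarning):
        model().fit()
```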

@thechargedneutron
Contributor Author

@TomDLT Parametrize added.

> Can you refactor it, for instance putting this part in _check_solver_option?

That would require _check_solver_option to return the solver chosen according to the penalty. I think the current implementation is better than returning all those values from the check function. I'll change it if you think returning is the better option.

@TomDLT
Member

TomDLT commented Jan 29, 2018

> I think the current implementation is better than returning all those from the check function.

Why?

But if you prefer you can also merge them into a new function. I just find it redundant and prone to future mistakes.

@jnothman (Member) left a comment

You really need to improve the docs here. I would consider documenting what 'auto' means to be something you'd do before labelling the PR as MRG.

@@ -500,6 +500,7 @@ def logistic_regression_path(X, y, pos_class=None, Cs=10, fit_intercept=True,
number for verbosity.

solver : {'lbfgs', 'newton-cg', 'liblinear', 'sag', 'saga'}
default: 'default'. Will be changed to 'auto' solver in 0.22.
Member

A default of 'default' doesn't mean anything. If you're going to say such a thing, you need to explain what it means. Better to say it is 'lbfgs' by default, even if the string 'default' is used as a placeholder. 'auto' needs to be added to the list of options and described.

@@ -587,6 +589,19 @@ def logistic_regression_path(X, y, pos_class=None, Cs=10, fit_intercept=True,
.. versionchanged:: 0.19
The "copy" parameter was removed.
"""
if solver == 'default':
solver = 'lbfgs'
warnings.warn("Default solver will be changed from 'lbfgs' "
Member

We should probably only warn if auto would choose a solver other than lbfgs.
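That suggestion can be sketched like this (the _auto_solver rule below is an invented placeholder, not the PR's logic): resolve what 'auto' would pick, keep returning the current default during deprecation, and warn only when the two disagree.

```python
import warnings

def _auto_solver(penalty):
    # invented placeholder for what a future 'auto' might pick
    return 'lbfgs' if penalty == 'l2' else 'saga'

def resolve_solver(solver, penalty):
    # during deprecation, 'default' still means the current default;
    # warn only if switching to 'auto' would actually change behaviour
    if solver == 'default':
        if _auto_solver(penalty) != 'lbfgs':
            warnings.warn("Default solver will be changed from 'lbfgs' to "
                          "'auto' in 0.22", FutureWarning)
        return 'lbfgs'
    return solver
```

With this shape, the common l2 case stays silent and only users whose behaviour would change get warned.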

@@ -1606,7 +1660,24 @@ def fit(self, X, y, sample_weight=None):
-------
self : object
"""
_check_solver_option(self.solver, self.multi_class, self.penalty,
if self.solver == 'default':
_solver = 'lbfgs'
Member

I find it strange to have these leading underscores. This is just a plain old local variable.

@jnothman
Member

jnothman commented Mar 5, 2018 via email

@jnothman
Member

I'm marking this as stalled / help wanted. But maybe we need a core dev to take it on, as it really is bad that we have ovr and intercept regularisation as default.

@amueller
Member

amueller commented Jun 2, 2018

Do we really want L-BFGS as the default, or SAG, or should it depend on the data? I think if we change the default we should change to a solver that's automatic, as we did for PCA, one that is picked according to the penalty.

@agramfort
Member

agramfort commented Jun 2, 2018 via email

@jnothman
Member

jnothman commented Jun 3, 2018

So can someone write down the decision tree of settings we're working towards? SAG and L-BFGS should at least reach the same coefficients given reasonable amounts of data?

I think this should be added as a priority for 0.20.

@jnothman jnothman added this to the 0.20 milestone Jun 3, 2018
@jnothman
Member

jnothman commented Jun 7, 2018

Another question related to this: should LogisticRegressionCV(solver='liblinear') be optimising intercept_scaling as well as C?

@jnothman
Member

@TomDLT, given your previous work in this space, would you mind working on defining appropriate defaults? If fit_intercept=False and (the data is binary or multi_class='ovr'), should we still be defaulting to liblinear?

I also wonder if we can say that for something named 'auto' we give no strong assurances of API stability across versions: we make a weak assurance that with sufficient (i.e. infinite) max_iter, 'auto' should result in convergence to the same solution (but not necessarily the same n_iter_) across versions.

@TomDLT
Member

TomDLT commented Jun 15, 2018

I'll do it

@amueller
Member

Hm, SAG is pretty unstable if you don't scale the data. We could always pre-scale the data, but that seems a bit un-scikit-learn.

@amueller
Member

@TomDLT are you still on this? Otherwise @NicolasHug will take it up. Also: are you at SciPy?

@amueller
Member

So do we just want to change the default of solver as a first step, or do we want to do an automagic 'auto'? It's really not clear to me how 'auto' would work, since good values for tol, max_iter and dual depend on the solver, right? If someone sets dual=True, does that mean they'll always get liblinear? Or do we ignore that unless they set the solver to liblinear?

On the one hand it would be annoying to make a backward-incompatible change twice; on the other hand, getting 'auto' right seems very hairy and probably not doable for 0.20, even without benchmarks.

If we want to run benchmarks, we would need to benchmark at least these cases, right?

- tall data (small n_features, large n_samples)
- wide data (small n_samples, large n_features)
- sparse wide data
- multi_class = ovr, multinomial, or binary
- fit_intercept = True/False

That would be 18 possible regimes already, though maybe the decision tree for the optimum solver doesn't involve all combinations.
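The count checks out; the grid above is just a Cartesian product:

```python
from itertools import product

# the benchmark dimensions listed above
shapes = ['tall', 'wide', 'sparse wide']
multi_class = ['binary', 'ovr', 'multinomial']
fit_intercept = [True, False]

regimes = list(product(shapes, multi_class, fit_intercept))
print(len(regimes))  # 3 * 3 * 2 = 18
```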

> SAG and L-BFGS should at least reach the same coefficients given reasonable amounts of data?

If the data is scaled badly that's probably not gonna happen, I think?

@jnothman
Member

jnothman commented Jul 11, 2018 via email


Successfully merging this pull request may close these issues.

change LogisticRegression default solver to lbfgs and multiclass to multinomial?
5 participants