Skip to content

[MRG] ENH Add trust-ncg option to LogisticRegression #17877

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from

Conversation

rithvikrao
Copy link
Contributor

@rithvikrao rithvikrao commented Jul 9, 2020

Co-authored-by: Ruby Werman rubywerman@berkeley.edu

Reference Issues/PRs

Fixes #17125

What does this implement/fix? Explain your changes.

Implements trust-ncg option in LogisticRegression when multi_class = 'multinomial'.

Any other comments?

Co-authored-by: Ruby Werman <rubywerman@berkeley.edu>
@rubywerman rubywerman force-pushed the logistic branch 2 times, most recently from c0da745 to c6d049a Compare July 10, 2020 17:27
Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During development, it would be good to benchmark these solvers comparing it with the methods already supported.

@flosincapite
Copy link

Hi, @thomasjpfan ! I see this PR has been languishing for a bit. Is there something else Rithvik and Ruby need to do to make this PR ready to merge?

@thomasjpfan
Copy link
Member

This needs benchmarks to compare the new solvers with the current solvers.

@thomasjpfan
Copy link
Member

thomasjpfan commented Jul 17, 2020

Removing the Draft would help with getting reviewers attention and maybe only adding one solver would be better. This way you just need to benchmark one of the solver for this PR.

The test are failing as well.

@rubywerman
Copy link
Contributor

rubywerman commented Jul 22, 2020

Hi @thomasjpfan! I wrote the following benchmark to examine memory usage and was wondering if you could take a look:
The notebook is here, but I pasted the code below as well:

from sklearn.linear_model import LogisticRegression   
import numpy as np
X, y = fetch_20newsgroups_vectorized(return_X_y=True)                                                                         
%load_ext memory_profiler
%memit LogisticRegression(multi_class='multinomial', solver="lbfgs").fit(X, y)
%memit LogisticRegression(multi_class='multinomial', solver="trust-ncg").fit(X, y)
%memit LogisticRegression(multi_class='multinomial', solver="trust-krylov").fit(X, y)

The outputs of the %memit lines were as followed:
lfbgs: peak memory: 1132.86 MiB, increment: 896.89 MiB
trust-ncg: peak memory: 458.23 MiB, increment: 55.56 MiB
trust-krylov: peak memory: 478.12 MiB, increment: 101.42 MiB

@rithvikrao rithvikrao changed the title Add trust-ncg and trust-krylov options to LogisticRegression [WIP] Add trust-ncg and trust-krylov options to LogisticRegression Jul 23, 2020
@rithvikrao rithvikrao marked this pull request as ready for review July 23, 2020 22:09
@rubywerman
Copy link
Contributor

I removed trust-krylov in the meantime and now we pass all the tests

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @rithvikrao !

@rubywerman
Copy link
Contributor

rubywerman commented Jul 29, 2020

Hi @thomasjpfan! Thank you for the feedback, I added in your suggestions. However, I'm confused about changing the function signature for hessp– what are x and p in your suggestion, def hessp(x, p, *args):? We wrote the hessp function going off this description in the docstring of _multinomial_grad_hess:

hessp : callable Function that takes in a vector input of shape (n_classes * n_features) or (n_classes * (n_features + 1)) and returns matrix-vector product with hessian.

@rithvikrao rithvikrao changed the title [WIP] Add trust-ncg and trust-krylov options to LogisticRegression [MRG] Add trust-ncg and trust-krylov options to LogisticRegression Jul 30, 2020
Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user guide needs to be updated with this new solver and describe when this would be useful.

@rithvikrao rithvikrao changed the title [MRG] Add trust-ncg and trust-krylov options to LogisticRegression [MRG] Add trust-ncg option to LogisticRegression Aug 7, 2020
Comment on lines +431 to +432
all_solvers = ['liblinear', 'newton-cg', 'lbfgs', 'sag', 'saga',
'trust-ncg']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
all_solvers = ['liblinear', 'newton-cg', 'lbfgs', 'sag', 'saga',
'trust-ncg']
all_solvers = [
'liblinear', 'newton-cg', 'lbfgs', 'sag', 'saga', 'trust-ncg'
]

@glemaitre
Copy link
Member

Could you update the test as well to make sure that we test for this new solver:

  • test_predict_iris: we could add the solver for this test;
  • test_multinomial_validation: we should check that we fail consistently;
  • test_multinomial_binary: I suppose that we want to try the solver with this configuration as well;
  • test_consistency_path: I suppose that we want to check that everything goes fine when fit_intercept=True;
  • test_ovr_multinomial_iris: check that the new solver is as well giving better result than ovr;
  • test_logistic_regressioncv_class_weights, test_logistic_regression_sample_weights, test_logistic_regression_class_weights, test_logreg_predict_proba_multinomial, test_n_iter, test_warm_start: : we should test the solver there;

@glemaitre glemaitre changed the title [MRG] Add trust-ncg option to LogisticRegression [MRG] ENH Add trust-ncg option to LogisticRegression Aug 21, 2020
Base automatically changed from master to main January 22, 2021 10:52
@lorentzenchr
Copy link
Member

@rithvikrao @rubywerman Are you sill working on this?
I'd be interested in:

  • additional benchmarks of fit time
  • other trust-region optimizers, i.e. method in ["trust-constr", "trust-ncg", "trust-krylov", "trust-exact"]

I did some experiments in the past, have a look at scipy/scipy#12275.

@rubywerman
Copy link
Contributor

rubywerman commented Mar 15, 2021

@lorentzenchr I am not! I don't think rithvik is either

@lorentzenchr
Copy link
Member

@rubywerman Thanks for the quick reply.

@lorentzenchr
Copy link
Member

@thomasjpfan @glemaitre Shall we close or is it better to keep it open. In case someone takes over, a new PR is usually the cleanest way, anyway.

@thomasjpfan
Copy link
Member

I think the current workflow is to close the PR when there is another PR that supersedes it. I think "Stalled" + open means we are still interested in the PR and a new contributor can ask the original contributor if they can work on it. "Stalled" + closed would mean we are not interested in this feature or the "PR is too old and it would be better to start fresh".

At a glance, I think this still need benchmarks, so we can advise users on "how to decide on solver" especially since we already have so many solvers. From #17877 (comment) it looks like there is a memory benefit in the sparse case. From the benchmarks by @lorentzenchr at scipy/scipy#12275, trust-ncg with the hessian converges but slower compared to lbfgs in the dense case.

@ogrisel
Copy link
Member

ogrisel commented Mar 17, 2021

In particular we need benchmarks that measure and compare the CPU execution time, n_iter_ and final negative likelihood (log loss) for several solvers and not just memory usage benchmarks as already reported above. It useful to compare memory usage but not enough.

@Micky774
Copy link
Contributor

I can try picking this up and adding some benchmarks

@thomasjpfan thomasjpfan added the Superseded PR has been replace by a newer PR label Feb 2, 2022
@cmarmo cmarmo removed the Stalled label May 10, 2022
@lorentzenchr
Copy link
Member

See conclusion in #22236.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module:linear_model module:utils Superseded PR has been replace by a newer PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LogisticRegression memory consumption goes crazy on 0.22+
9 participants