[MRG] ENH Add `trust-ncg` option to LogisticRegression #17877

rithvikrao · 2020-07-09T22:14:14Z

Reference Issues/PRs

Fixes #17125

What does this implement/fix? Explain your changes.

Implements trust-ncg option in LogisticRegression when multi_class = 'multinomial'.

Any other comments?

Co-authored-by: Ruby Werman <rubywerman@berkeley.edu>

thomasjpfan

During development, it would be good to benchmark these solvers against the methods that are already supported.

flosincapite · 2020-07-17T18:23:55Z

Hi, @thomasjpfan ! I see this PR has been languishing for a bit. Is there something else Rithvik and Ruby need to do to make this PR ready to merge?

thomasjpfan · 2020-07-17T19:16:36Z

This needs benchmarks to compare the new solvers with the current solvers.

thomasjpfan · 2020-07-17T19:18:25Z

Removing the Draft status would help with getting reviewers' attention, and adding only one solver might be better. That way you only need to benchmark one solver for this PR.

The tests are failing as well.

rubywerman · 2020-07-22T20:55:31Z

Hi @thomasjpfan! I wrote the following benchmark to examine memory usage and was wondering if you could take a look:
The notebook is here, but I pasted the code below as well:

from sklearn.datasets import fetch_20newsgroups_vectorized
from sklearn.linear_model import LogisticRegression
import numpy as np
# Sparse, high-dimensional multiclass text dataset
X, y = fetch_20newsgroups_vectorized(return_X_y=True)
# IPython/Jupyter magics provided by the memory_profiler package
%load_ext memory_profiler
%memit LogisticRegression(multi_class='multinomial', solver="lbfgs").fit(X, y)
%memit LogisticRegression(multi_class='multinomial', solver="trust-ncg").fit(X, y)
%memit LogisticRegression(multi_class='multinomial', solver="trust-krylov").fit(X, y)

The outputs of the %memit lines were as follows:
lbfgs: peak memory: 1132.86 MiB, increment: 896.89 MiB
trust-ncg: peak memory: 458.23 MiB, increment: 55.56 MiB
trust-krylov: peak memory: 478.12 MiB, increment: 101.42 MiB
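
For anyone who wants to repeat this kind of comparison as a plain script, here is a minimal sketch. It makes several assumptions that are not part of the PR: it uses the standard-library `tracemalloc` instead of `%memit`, small synthetic sparse data instead of 20 newsgroups, and only solvers that exist in released scikit-learn (`trust-ncg` is not one of them). `tracemalloc` only sees allocations reported to Python, so its numbers are not comparable to the process-level peaks above.

```python
import tracemalloc
import warnings

import numpy as np
from scipy import sparse
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

# Small synthetic sparse problem (an assumption; the PR used 20 newsgroups).
rng = np.random.default_rng(0)
X = sparse.random(500, 2000, density=0.01, format="csr", random_state=0)
y = rng.integers(0, 5, size=500)


def peak_mib(solver):
    """Return the peak traced memory (MiB) while fitting with `solver`."""
    clf = LogisticRegression(solver=solver, max_iter=50)
    tracemalloc.start()
    with warnings.catch_warnings():
        # The iteration cap is kept low for speed, so non-convergence is expected.
        warnings.simplefilter("ignore", ConvergenceWarning)
        clf.fit(X, y)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak / 2**20


peaks = {solver: peak_mib(solver) for solver in ["lbfgs", "newton-cg"]}
for solver, peak in peaks.items():
    print(f"{solver}: peak traced memory {peak:.2f} MiB")
```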

rubywerman · 2020-07-24T17:43:29Z

I removed trust-krylov in the meantime, and now we pass all the tests.

thomasjpfan

Thank you for the PR @rithvikrao !

sklearn/linear_model/_logistic.py

rubywerman · 2020-07-29T15:27:49Z

Hi @thomasjpfan! Thank you for the feedback, I added in your suggestions. However, I'm confused about changing the function signature for hessp: what are x and p in your suggestion, def hessp(x, p, *args)? We wrote the hessp function based on this description in the docstring of _multinomial_grad_hess:

hessp : callable Function that takes in a vector input of shape (n_classes * n_features) or (n_classes * (n_features + 1)) and returns matrix-vector product with hessian.
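
For context on the signature in question: `scipy.optimize.minimize` calls `hessp(x, p, *args)`, where `x` is the current parameter vector (the point at which the Hessian is evaluated), `p` is the arbitrary vector to be multiplied by the Hessian, and `*args` are the extra arguments passed through `args=`. The sketch below illustrates this on a toy L2-penalized binary logistic loss. The data and the helper names `loss`, `grad`, and `hessp` are assumptions made for illustration; this is not the code from this PR or scikit-learn's multinomial implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit


def loss(w, X, y, alpha):
    """L2-penalized binary logistic loss; y takes values in {0, 1}."""
    z = X @ w
    # log(1 + exp(z)) - y * z, computed stably via logaddexp
    return np.sum(np.logaddexp(0.0, z) - y * z) + 0.5 * alpha * (w @ w)


def grad(w, X, y, alpha):
    """Gradient of `loss` with respect to w."""
    return X.T @ (expit(X @ w) - y) + alpha * w


def hessp(w, p, X, y, alpha):
    """Hessian-vector product H(w) @ p, without forming H explicitly."""
    s = expit(X @ w)
    d = s * (1.0 - s)  # per-sample curvature weights
    return X.T @ (d * (X @ p)) + alpha * p


# Toy data (an assumption, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X @ rng.normal(size=5) + rng.normal(size=200) > 0).astype(float)

res = minimize(
    loss,
    np.zeros(5),
    args=(X, y, 1.0),  # forwarded to loss, grad and hessp as *args
    method="trust-ncg",
    jac=grad,
    hessp=hessp,
)
print(res.success, res.nit)
```

Because only Hessian-vector products are needed, the full Hessian of size (n_params, n_params) is never materialized.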

sklearn/linear_model/_logistic.py

thomasjpfan

The user guide needs to be updated with this new solver and describe when this would be useful.

glemaitre · 2020-08-21T17:03:10Z

sklearn/linear_model/_logistic.py

+    all_solvers = ['liblinear', 'newton-cg', 'lbfgs', 'sag', 'saga',
+                   'trust-ncg']


Suggested change

-    all_solvers = ['liblinear', 'newton-cg', 'lbfgs', 'sag', 'saga',
-                   'trust-ncg']
+    all_solvers = [
+        'liblinear', 'newton-cg', 'lbfgs', 'sag', 'saga', 'trust-ncg'
+    ]

glemaitre · 2020-08-21T17:18:14Z

Could you update the tests as well to make sure that we test this new solver:

test_predict_iris: we could add the solver to this test;
test_multinomial_validation: we should check that we fail consistently;
test_multinomial_binary: I suppose that we want to try the solver with this configuration as well;
test_consistency_path: I suppose that we want to check that everything goes fine when fit_intercept=True;
test_ovr_multinomial_iris: check that the new solver also gives better results than ovr;
test_logistic_regressioncv_class_weights, test_logistic_regression_sample_weights, test_logistic_regression_class_weights, test_logreg_predict_proba_multinomial, test_n_iter, test_warm_start: we should test the solver there;
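
As a rough illustration of the kind of cross-solver check these tests perform, the sketch below fits the same multiclass problem with two solvers and compares predicted probabilities. It is an assumption-laden stand-in, not one of the scikit-learn tests named above: it uses only solvers available in released scikit-learn, and the helper `fit_proba` and the tolerance are chosen for illustration. A new solver name would be added to the loop.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X)  # scaling helps all solvers converge


def fit_proba(solver):
    """Fit with a tight tolerance and return class probabilities on X."""
    clf = LogisticRegression(solver=solver, tol=1e-8, max_iter=1000)
    return clf.fit(X, y).predict_proba(X)


reference = fit_proba("lbfgs")
for solver in ["newton-cg"]:  # a newly added solver would be listed here
    # Probabilities, unlike raw coefficients, are unaffected by a constant
    # shift of all class intercepts, so they are the safer thing to compare.
    np.testing.assert_allclose(fit_proba(solver), reference, atol=1e-3)
```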

lorentzenchr · 2021-03-14T15:11:16Z

@rithvikrao @rubywerman Are you still working on this?
I'd be interested in:

additional benchmarks of fit time
other trust-region optimizers, i.e. method in ["trust-constr", "trust-ncg", "trust-krylov", "trust-exact"]
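
A minimal sketch of how these four `scipy.optimize.minimize` methods could be run side by side is shown below. It is not the experiment from scipy/scipy#12275: the toy binary logistic loss, the synthetic data, and the closures `loss`, `grad`, and `hess` are assumptions for illustration. A full Hessian callable is used because `trust-exact` requires one.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

# Toy dense problem (an assumption, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X @ rng.normal(size=4) + rng.normal(size=300) > 0).astype(float)
alpha = 1.0  # L2 penalty strength


def loss(w):
    z = X @ w
    return np.sum(np.logaddexp(0.0, z) - y * z) + 0.5 * alpha * (w @ w)


def grad(w):
    return X.T @ (expit(X @ w) - y) + alpha * w


def hess(w):
    s = expit(X @ w)
    # X^T diag(s * (1 - s)) X + alpha * I
    return (X.T * (s * (1.0 - s))) @ X + alpha * np.eye(X.shape[1])


results = {}
for method in ["trust-constr", "trust-ncg", "trust-krylov", "trust-exact"]:
    res = minimize(loss, np.zeros(X.shape[1]), method=method, jac=grad, hess=hess)
    results[method] = res
    print(f"{method:13s} nit={res.nit:3d} loss={res.fun:.6f}")
```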

I did some experiments in the past, have a look at scipy/scipy#12275.

rubywerman · 2021-03-15T05:03:09Z

@lorentzenchr I am not! I don't think rithvik is either

lorentzenchr · 2021-03-16T21:07:16Z

@rubywerman Thanks for the quick reply.

lorentzenchr · 2021-03-16T21:09:04Z

@thomasjpfan @glemaitre Shall we close this, or is it better to keep it open? In case someone takes over, a new PR is usually the cleanest way anyway.

thomasjpfan · 2021-03-17T14:52:27Z

I think the current workflow is to close the PR when there is another PR that supersedes it. I think "Stalled" + open means we are still interested in the PR and a new contributor can ask the original contributor if they can work on it. "Stalled" + closed would mean we are not interested in this feature or the "PR is too old and it would be better to start fresh".

At a glance, I think this still needs benchmarks, so we can advise users on "how to decide on solver", especially since we already have so many solvers. From #17877 (comment) it looks like there is a memory benefit in the sparse case. From the benchmarks by @lorentzenchr at scipy/scipy#12275, trust-ncg with the Hessian converges, but more slowly than lbfgs in the dense case.

ogrisel · 2021-03-17T15:05:05Z

In particular, we need benchmarks that measure and compare the CPU execution time, n_iter_ and final negative likelihood (log loss) for several solvers, and not just memory usage benchmarks as already reported above. It is useful to compare memory usage, but that is not enough.
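
A benchmark along these lines might start from the sketch below. It rests on assumptions: the dataset is synthetic, only solvers available in released scikit-learn are included (a new solver would be added to the list), and the reported log loss is the unpenalized mean on the training data rather than the penalized objective that the solvers minimize.

```python
import time
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

# Synthetic 3-class problem (an assumption, for illustration only).
X, y = make_classification(
    n_samples=2000, n_features=30, n_informative=10, n_classes=3, random_state=0
)

rows = []
for solver in ["lbfgs", "newton-cg", "saga"]:
    clf = LogisticRegression(solver=solver, max_iter=500)
    start = time.process_time()  # CPU time, not wall-clock time
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", ConvergenceWarning)
        clf.fit(X, y)
    cpu_seconds = time.process_time() - start
    train_loss = log_loss(y, clf.predict_proba(X))
    rows.append((solver, cpu_seconds, int(clf.n_iter_.max()), train_loss))

for solver, cpu_seconds, n_iter, train_loss in rows:
    print(f"{solver:10s} cpu={cpu_seconds:.3f}s n_iter={n_iter:4d} log_loss={train_loss:.5f}")
```

For a fair comparison, the timings would need to be repeated several times and run on both dense and sparse inputs of realistic size.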

Micky774 · 2022-01-17T18:49:21Z

I can try picking this up and adding some benchmarks.

lorentzenchr · 2022-06-22T19:38:07Z

See conclusion in #22236.

Initial commit

70ecf59

Co-authored-by: Ruby Werman <rubywerman@berkeley.edu>

github-actions bot added module:linear_model module:utils labels Jul 9, 2020

Remove extraneous print statement

1ad6650

rubywerman force-pushed the logistic branch 2 times, most recently from c0da745 to c6d049a Compare July 10, 2020 17:27

rithvikrao force-pushed the logistic branch from c6d049a to 1ad6650 Compare July 10, 2020 18:53

thomasjpfan reviewed Jul 12, 2020

rithvikrao changed the title ~~Add trust-ncg and trust-krylov options to LogisticRegression~~ [WIP] Add trust-ncg and trust-krylov options to LogisticRegression Jul 23, 2020

rithvikrao marked this pull request as ready for review July 23, 2020 22:09

rithvikrao and others added 4 commits July 23, 2020 15:13

Merge branch 'master' into logistic

ba513a0

remove trust-krylov solver

03381b3

remove trust-krylov solver

9a0e192

revert changes on this file, not finished with it yet

a9e6208

thomasjpfan reviewed Jul 27, 2020

rubywerman added 2 commits July 27, 2020 23:28

add hess suggestions

0559dd1

change hessp parameter

ea103e1

rithvikrao changed the title ~~[WIP] Add trust-ncg and trust-krylov options to LogisticRegression~~ [MRG] Add trust-ncg and trust-krylov options to LogisticRegression Jul 30, 2020

thomasjpfan reviewed Aug 6, 2020

change hessp signature

6b27b1d

thomasjpfan reviewed Aug 7, 2020

rithvikrao changed the title ~~[MRG] Add trust-ncg and trust-krylov options to LogisticRegression~~ [MRG] Add trust-ncg option to LogisticRegression Aug 7, 2020

rubywerman added 2 commits August 17, 2020 13:20

add trust-ncg to table

b4a016d

add use case for trust-ncg

8702ed7

glemaitre reviewed Aug 21, 2020

glemaitre changed the title ~~[MRG] Add trust-ncg option to LogisticRegression~~ [MRG] ENH Add trust-ncg option to LogisticRegression Aug 21, 2020

Base automatically changed from master to main January 22, 2021 10:52

lorentzenchr added the Stalled label Mar 16, 2021

lorentzenchr added help wanted and removed help wanted labels Mar 19, 2021

lorentzenchr mentioned this pull request Mar 19, 2021

LogisticRegression memory consumption goes crazy on 0.22+ #17125

Closed

Micky774 mentioned this pull request Jan 17, 2022

ENH Add trust-ncg solver to LogisticRegression #22236

Closed

thomasjpfan added the Superseded (PR has been replaced by a newer PR) label Feb 2, 2022

cmarmo removed the Stalled label May 10, 2022

lorentzenchr closed this Jun 22, 2022
