FIX properly report `n_iter_` in case of fallback from Newton-Cholesky to LBFGS #30100

ogrisel · 2024-10-18T12:10:28Z

Fix for a small bug discovered while reviewing #28840: whenever the Newton-Cholesky solver of LogisticRegression fell back to LBFGS, the reported n_iter_ attribute was left at zero.

I made it so that any iteration completed by the Newton-Cholesky solver is subtracted from max_iter before calling LBFGS, and the sum of the iterations completed by the two solvers is reported at the end. In practice, this does not seem to change anything because the Hessian conditioning problem always happens during the first iteration in my experiments with weakly regularized, rank-deficient problems that typically trigger the LBFGS fallback mechanism.
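For readers who want the budget accounting described above spelled out, here is a minimal pure-Python sketch. It is not the scikit-learn implementation: `fit_with_fallback`, `newton_step` and `lbfgs_solve` are hypothetical names made up for illustration, and the two solvers are replaced by toy callables.

```python
def fit_with_fallback(max_iter, newton_step, lbfgs_solve):
    """Toy accounting of iterations across a primary solver and a fallback.

    newton_step(i) is a hypothetical callable returning False when the
    Newton step number i fails (e.g. ill-conditioned Hessian).
    lbfgs_solve(budget) is a hypothetical callable returning the number of
    iterations the fallback used, which must not exceed budget.
    """
    n_newton = 0
    for i in range(max_iter):
        if not newton_step(i):
            # Fallback: only hand over what is left of the iteration budget.
            remaining = max_iter - n_newton
            n_lbfgs = lbfgs_solve(remaining)
            # Report the sum of iterations completed by both solvers.
            return n_newton + n_lbfgs
        n_newton += 1
    return n_newton


# Newton fails on its very first step and the fallback needs 7 iterations.
n_iter = fit_with_fallback(
    max_iter=100,
    newton_step=lambda i: False,
    lbfgs_solve=lambda budget: min(7, budget),
)
print(n_iter)
```

With a failure on the first Newton step, the reported count is just the fallback's count, which matches the observation that the change rarely makes a difference in practice.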

Note that I find the LBFGS fallback warning quite annoying whenever it is triggered while tuning the regularization level (e.g. using LogisticRegressionCV or RandomizedSearchCV). I have the feeling that this should be a regular verbose print instead, but we can tackle that in a separate PR.

github-actions · 2024-10-18T12:12:05Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: e0ec69a.

lorentzenchr

Thanks for spotting and fixing this bug.
I also would like to see improvements for the fallback case.

ogrisel · 2024-10-18T15:38:56Z

The global_random_seed parametrized test passes for all seeds locally.

Unfortunately this is not the case on our macOS pylatest_conda_mkl_no_openmp runner, even after having decreased the regularization and increased the data width.

There is a failure with global_random_seed=6 on that host.

        # Trying to fit the same model again with a small iteration budget should
        # therefore raise a ConvergenceWarning:
        lr_nc_limited = LogisticRegression(
            solver="newton-cholesky", C=C, max_iter=n_iter_lbgs - 1
        )
>       with pytest.warns(LinAlgWarning, match="ill-conditioned Hessian matrix"):
E       Failed: DID NOT WARN. No warnings of type (<class 'scipy.linalg._misc.LinAlgWarning'>,) were emitted.
E       The list of emitted warnings is: []

I pushed a new commit with the [all random seeds] commit message for that test to see whether the problem is specific to this CI environment or whether the test is also brittle in other environments.

ogrisel · 2024-10-18T15:50:02Z

Ok, so the warning is never raised for macOS pylatest_conda_mkl_no_openmp whatever the choice of the random seed, while it is always raised (as expected) for all the seeds in all the other testing environments. So the problem is not related to the statistical properties of the test problem.

ogrisel · 2024-10-18T16:08:03Z

I tried to look into the scipy version, the BLAS version and so on, but I could not find anything that would explain why the warning was not raised only on macOS pylatest_conda_mkl_no_openmp while it works in other environments (with or without MKL, with older and newer versions of scipy and so on).

ogrisel · 2024-10-18T16:24:39Z

Ok, so it seems to be a bug in the propagation of LinAlgWarning but the iteration logic is still somehow respected...

I will push another commit to clean up the commented-out code and keep the ignore_warnings workaround for now.
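As a rough illustration of the workaround discussed above (silence the warning category rather than require it, and still check the iteration count), here is a standard-library sketch. `FakeLinAlgWarning` and `fit` are hypothetical stand-ins for scipy's LinAlgWarning and for the model fit, and `warnings.catch_warnings` is used here in place of scikit-learn's ignore_warnings helper.

```python
import warnings


class FakeLinAlgWarning(RuntimeWarning):
    """Hypothetical stand-in for scipy.linalg.LinAlgWarning."""


def fit(emit_warning):
    """Pretend solver: may or may not warn, always returns an iteration count."""
    if emit_warning:
        warnings.warn("ill-conditioned Hessian matrix", FakeLinAlgWarning)
    return 12


def n_iter_ignoring_warning(emit_warning):
    # Ignore the category instead of asserting on it, so the check on the
    # iteration count does not depend on the warning being propagated.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=FakeLinAlgWarning)
        return fit(emit_warning)


# The iteration count is the same whether or not the platform emits the warning.
print(n_iter_ignoring_warning(True), n_iter_ignoring_warning(False))
```

The trade-off is that the test no longer verifies that the warning is emitted, which is why this is kept only as a temporary measure.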

FIX properly report n_iter in case of Newton-Cholesky to LBFGS fallback

a28693c

github-actions bot added the module:linear_model label Oct 18, 2024

DOC add a changelog entry

0ca54fb

ogrisel changed the title ~~FIX properly report n_iter in case of Newton-Cholesky to LBFGS fallback~~ FIX properly report n_iter_ in case of fallback from Newton-Cholesky to LBFGS Oct 18, 2024

Make the problem even more ill-conditioned

51bbe41

lorentzenchr approved these changes Oct 18, 2024

sklearn/linear_model/tests/test_logistic.py (outdated)

lorentzenchr added this to the 1.6 milestone Oct 18, 2024

ogrisel and others added 2 commits October 18, 2024 17:34

Update sklearn/linear_model/tests/test_logistic.py

24904c4

Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>

[all random seeds]

aaa8010

test_newton_cholesky_fallback_to_lbfgs

ogrisel added 3 commits October 18, 2024 17:52

Fix typo

a94755b

Improve test:

998e76e

- use ignore_warnings(category=LinAlgWarning) for now;
- store n_iter values in local variables to help CI-log debugging;
- fix final assertion.

[azure parallel] [all random seeds] test_newton_cholesky_fallback_to_…

0e16da2

…lbfgs

Clean-up

423dce8

ogrisel added the Quick Review For PRs that are quick to review label Oct 18, 2024

glemaitre reviewed Oct 18, 2024

doc/whats_new/upcoming_changes/sklearn.linear_model/30100.fix.rst (outdated)

Update doc/whats_new/upcoming_changes/sklearn.linear_model/30100.fix.rst

d9d0cb8

adrinjalali reviewed Oct 21, 2024

sklearn/linear_model/tests/test_logistic.py (outdated)

ogrisel commented Oct 21, 2024

sklearn/linear_model/tests/test_logistic.py (outdated)

Remove comment about max_iter - 1

e0ec69a

adrinjalali approved these changes Oct 21, 2024

adrinjalali enabled auto-merge (squash) October 21, 2024 09:50

adrinjalali merged commit 39e1cc1 into scikit-learn:main Oct 21, 2024
28 checks passed

ogrisel deleted the newton-cholesky-to-lbfgs-n_iter branch October 21, 2024 12:38
