BUG: Fix SGD models(SGDRegressor etc.) convergence criteria #31856


Open
wants to merge 14 commits into base: main

Conversation


@kostayScr kostayScr commented Jul 30, 2025

Reference Issues/PRs

Based on draft PR #30031. Closes #30027.

What does this implement/fix? Explain your changes.

Changes the SGD optimization loop in sklearn/linear_model/_sgd_fast.pyx.tp to use a correct stopping criterion. Instead of the raw error (loss), it now uses the full objective value: for regression/classification the objective includes the regularization term, and for the one-class SVM it also includes the intercept term.
This change prevents incorrect premature stopping of the optimization, often after as few as 6 epochs. The problem is most pronounced with SGDOneClassSVM, but SGDRegressor and SGDClassifier are affected as well.
To implement this, the WeightVector class is modified to also accumulate the L1 norm, and the objective value is computed in the optimization loop.
Also adds a test comparing SGDOneClassSVM against liblinear's one-class SVM.
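As a rough sketch of the new stop metric (the names and the elastic-net form below are illustrative, not sklearn's internal Cython API):

```python
import numpy as np

def full_objective(avg_loss, weights, alpha, l1_ratio):
    """Average data loss plus the elastic-net penalty.

    Illustrative only: in the actual PR the L1 norm is accumulated
    inside WeightVector rather than recomputed from a dense array.
    """
    l1 = np.abs(weights).sum()
    l2 = 0.5 * float(weights @ weights)
    return avg_loss + alpha * (l1_ratio * l1 + (1.0 - l1_ratio) * l2)

# With a pure L2 penalty, a zero average loss no longer looks
# "converged" while the penalty term keeps changing between epochs.
w = np.array([1.0, -1.0])
print(full_objective(0.0, w, alpha=0.1, l1_ratio=0.0))  # 0.1
```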

Before the fix (example from the linked issue):

10k samples, 1000 features
-- Epoch 1
Norm: 0.95, NNZs: 1000, Bias: -5.741972, T: 10000, Avg. loss: 0.000000
Total training time: 0.01 seconds.
-- Epoch 2
Norm: 0.47, NNZs: 1000, Bias: -7.123019, T: 20000, Avg. loss: 0.000000
Total training time: 0.02 seconds.
-- Epoch 3
Norm: 0.32, NNZs: 1000, Bias: -7.932197, T: 30000, Avg. loss: 0.000000
Total training time: 0.03 seconds.
-- Epoch 4
Norm: 0.24, NNZs: 1000, Bias: -8.506685, T: 40000, Avg. loss: 0.000000
Total training time: 0.05 seconds.
-- Epoch 5
Norm: 0.38, NNZs: 1000, Bias: -8.948081, T: 50000, Avg. loss: 0.000001
Total training time: 0.06 seconds.
-- Epoch 6
Norm: 0.32, NNZs: 1000, Bias: -9.312374, T: 60000, Avg. loss: 0.000000
Total training time: 0.07 seconds.
Convergence after 6 epochs took 0.07 seconds

After the fix, the model converges:

10k samples, 1000 features
-- Epoch 1
Norm: 0.95, NNZs: 1000, Bias: -5.741972, T: 10000, Avg. loss: 0.000000, Objective: -0.037972
Total training time: 0.01 seconds.
-- Epoch 2
Norm: 0.47, NNZs: 1000, Bias: -7.123019, T: 20000, Avg. loss: 0.000000, Objective: -0.065113
Total training time: 0.02 seconds.
-- Epoch 3
Norm: 0.32, NNZs: 1000, Bias: -7.932197, T: 30000, Avg. loss: 0.000000, Objective: -0.075548
Total training time: 0.04 seconds.
-- Epoch 4
Norm: 0.24, NNZs: 1000, Bias: -8.506685, T: 40000, Avg. loss: 0.000000, Objective: -0.082331
Total training time: 0.05 seconds.
-- Epoch 5
Norm: 0.38, NNZs: 1000, Bias: -8.948072, T: 50000, Avg. loss: 0.000003, Objective: -0.087356
Total training time: 0.06 seconds.
-- Epoch 6
Norm: 0.31, NNZs: 1000, Bias: -9.312364, T: 60000, Avg. loss: 0.000000, Objective: -0.091357
Total training time: 0.08 seconds.
-- Epoch 7
Norm: 0.27, NNZs: 1000, Bias: -9.620415, T: 70000, Avg. loss: 0.000000, Objective: -0.094703
Total training time: 0.09 seconds.
-- Epoch 8
Norm: 0.24, NNZs: 1000, Bias: -9.887290, T: 80000, Avg. loss: 0.000000, Objective: -0.097568
Total training time: 0.10 seconds.
-- Epoch 9
Norm: 0.31, NNZs: 1000, Bias: -10.120255, T: 90000, Avg. loss: 0.000002, Objective: -0.100050
Total training time: 0.12 seconds.
-- Epoch 10
Norm: 0.28, NNZs: 1000, Bias: -10.330859, T: 100000, Avg. loss: 0.000000, Objective: -0.102274
Total training time: 0.14 seconds.
-- Epoch 11
Norm: 0.26, NNZs: 1000, Bias: -10.521383, T: 110000, Avg. loss: 0.000000, Objective: -0.104276
Total training time: 0.16 seconds.
-- Epoch 12
Norm: 0.31, NNZs: 1000, Bias: -10.693581, T: 120000, Avg. loss: 0.000002, Objective: -0.106084
Total training time: 0.17 seconds.
-- Epoch 13
Norm: 0.29, NNZs: 1000, Bias: -10.853599, T: 130000, Avg. loss: 0.000000, Objective: -0.107746
Total training time: 0.18 seconds.
-- Epoch 14
Norm: 0.27, NNZs: 1000, Bias: -11.001757, T: 140000, Avg. loss: 0.000000, Objective: -0.109286
Total training time: 0.19 seconds.
-- Epoch 15
Norm: 0.31, NNZs: 1000, Bias: -11.138324, T: 150000, Avg. loss: 0.000000, Objective: -0.110710
Total training time: 0.20 seconds.
-- Epoch 16
Norm: 0.29, NNZs: 1000, Bias: -11.267358, T: 160000, Avg. loss: 0.000000, Objective: -0.112035
Total training time: 0.22 seconds.
-- Epoch 17
Norm: 0.28, NNZs: 1000, Bias: -11.388568, T: 170000, Avg. loss: 0.000000, Objective: -0.113286
Total training time: 0.23 seconds.
-- Epoch 18
Norm: 0.31, NNZs: 1000, Bias: -11.501724, T: 180000, Avg. loss: 0.000000, Objective: -0.114460
Total training time: 0.24 seconds.
-- Epoch 19
Norm: 0.30, NNZs: 1000, Bias: -11.609828, T: 190000, Avg. loss: 0.000000, Objective: -0.115563
Total training time: 0.25 seconds.
-- Epoch 20
Norm: 0.28, NNZs: 1000, Bias: -11.712387, T: 200000, Avg. loss: 0.000000, Objective: -0.116615
Total training time: 0.26 seconds.
-- Epoch 21
Norm: 0.31, NNZs: 1000, Bias: -11.808978, T: 210000, Avg. loss: 0.000001, Objective: -0.117612
Total training time: 0.27 seconds.
-- Epoch 22
Norm: 0.30, NNZs: 1000, Bias: -11.901996, T: 220000, Avg. loss: 0.000000, Objective: -0.118558
Total training time: 0.29 seconds.
-- Epoch 23
Norm: 0.29, NNZs: 1000, Bias: -11.990878, T: 230000, Avg. loss: 0.000000, Objective: -0.119468
Total training time: 0.30 seconds.
-- Epoch 24
Norm: 0.31, NNZs: 1000, Bias: -12.075135, T: 240000, Avg. loss: 0.000000, Objective: -0.120335
Total training time: 0.31 seconds.
-- Epoch 25
Norm: 0.30, NNZs: 1000, Bias: -12.156761, T: 250000, Avg. loss: 0.000000, Objective: -0.121162
Total training time: 0.32 seconds.
Convergence after 25 epochs took 0.32 seconds

See linked issue for full code.

Any other comments?

This PR probably needs a changelog entry, since the output of SGD models (regressor, classifier, one-class) can change when tol != None.


github-actions bot commented Jul 30, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 4ac8494. Link to the linter CI: here

@kostayScr
Author

kostayScr commented Jul 31, 2025

Had to fix a test that was not passing: the tiny sample size causes large loss/objective spikes during convergence. Output of the failing test_multi_output_classification_partial_fit_sample_weights() (see the end):

----------------------------------------------------------------------------- Captured stdout call -----------------------------------------------------------------------------
-- Epoch 1
Norm: 22.32, NNZs: 3, Bias: 10.019960, T: 3, Avg. loss: 220.453546, Objective: 220.640027
Total training time: 0.00 seconds.
-- Epoch 2
Norm: 43.09, NNZs: 3, Bias: 29.950229, T: 6, Avg. loss: 220.298896, Objective: 220.414816
Total training time: 0.00 seconds.
-- Epoch 3
Norm: 82.26, NNZs: 3, Bias: 20.029594, T: 9, Avg. loss: 55.003933, Objective: 55.096568
Total training time: 0.00 seconds.
-- Epoch 4
Norm: 75.98, NNZs: 3, Bias: 39.841406, T: 12, Avg. loss: 191.670434, Objective: 192.014498
Total training time: 0.00 seconds.
-- Epoch 5
Norm: 97.63, NNZs: 3, Bias: 49.742319, T: 15, Avg. loss: 129.069953, Objective: 129.410023
Total training time: 0.00 seconds.
-- Epoch 6
Norm: 106.24, NNZs: 3, Bias: 69.437075, T: 18, Avg. loss: 163.360849, Objective: 163.915589
Total training time: 0.00 seconds.
-- Epoch 7
Norm: 105.93, NNZs: 3, Bias: 69.437075, T: 21, Avg. loss: 0.000000, Objective: 0.563290
Total training time: 0.00 seconds.
-- Epoch 8
Norm: 105.62, NNZs: 3, Bias: 69.437075, T: 24, Avg. loss: 0.000000, Objective: 0.559985
Total training time: 0.00 seconds.
-- Epoch 9
Norm: 105.31, NNZs: 3, Bias: 69.437075, T: 27, Avg. loss: 0.000000, Objective: 0.556708
Total training time: 0.00 seconds.
-- Epoch 10
Norm: 105.01, NNZs: 3, Bias: 69.437075, T: 30, Avg. loss: 0.000000, Objective: 0.553461
Total training time: 0.00 seconds.
-- Epoch 11
Norm: 104.70, NNZs: 3, Bias: 69.437075, T: 33, Avg. loss: 0.000000, Objective: 0.550241
Total training time: 0.00 seconds.
-- Epoch 12
Norm: 104.40, NNZs: 3, Bias: 69.437075, T: 36, Avg. loss: 0.000000, Objective: 0.547050
Total training time: 0.00 seconds.
-- Epoch 13
Norm: 104.10, NNZs: 3, Bias: 69.437075, T: 39, Avg. loss: 0.000000, Objective: 0.543886
Total training time: 0.00 seconds.
-- Epoch 14
Norm: 103.80, NNZs: 3, Bias: 69.437075, T: 42, Avg. loss: 0.000000, Objective: 0.540750
Total training time: 0.00 seconds.
-- Epoch 15
Norm: 103.50, NNZs: 3, Bias: 69.437075, T: 45, Avg. loss: 0.000000, Objective: 0.537641
Total training time: 0.00 seconds.
-- Epoch 16
Norm: 103.20, NNZs: 3, Bias: 69.437075, T: 48, Avg. loss: 0.000000, Objective: 0.534558
Total training time: 0.00 seconds.
-- Epoch 17
Norm: 102.91, NNZs: 3, Bias: 69.437075, T: 51, Avg. loss: 0.000000, Objective: 0.531502
Total training time: 0.00 seconds.
-- Epoch 18
Norm: 102.61, NNZs: 3, Bias: 69.437075, T: 54, Avg. loss: 0.000000, Objective: 0.528472
Total training time: 0.00 seconds.
-- Epoch 19
Norm: 102.32, NNZs: 3, Bias: 69.437075, T: 57, Avg. loss: 0.000000, Objective: 0.525468
Total training time: 0.00 seconds.
-- Epoch 20
Norm: 102.03, NNZs: 3, Bias: 69.437075, T: 60, Avg. loss: 0.000000, Objective: 0.522490
Total training time: 0.00 seconds.
-- Epoch 1
Norm: 22.32, NNZs: 3, Bias: -10.019960, T: 3, Avg. loss: 220.453546, Objective: 220.640027
Total training time: 0.00 seconds.
-- Epoch 2
Norm: 43.09, NNZs: 3, Bias: -29.950229, T: 6, Avg. loss: 220.298896, Objective: 220.414816
Total training time: 0.00 seconds.
-- Epoch 3
Norm: 82.26, NNZs: 3, Bias: -20.029594, T: 9, Avg. loss: 55.003933, Objective: 55.096568
Total training time: 0.00 seconds.
-- Epoch 4
Norm: 75.98, NNZs: 3, Bias: -39.841406, T: 12, Avg. loss: 191.670434, Objective: 192.014498
Total training time: 0.00 seconds.
-- Epoch 5
Norm: 97.63, NNZs: 3, Bias: -49.742319, T: 15, Avg. loss: 129.069953, Objective: 129.410023
Total training time: 0.00 seconds.
-- Epoch 6
Norm: 106.24, NNZs: 3, Bias: -69.437075, T: 18, Avg. loss: 163.360849, Objective: 163.915589
Total training time: 0.00 seconds.
-- Epoch 7
Norm: 105.93, NNZs: 3, Bias: -69.437075, T: 21, Avg. loss: 0.000000, Objective: 0.563290
Total training time: 0.00 seconds.
-- Epoch 8
Norm: 105.62, NNZs: 3, Bias: -69.437075, T: 24, Avg. loss: 0.000000, Objective: 0.559985
Total training time: 0.00 seconds.
-- Epoch 9
Norm: 105.31, NNZs: 3, Bias: -69.437075, T: 27, Avg. loss: 0.000000, Objective: 0.556708
Total training time: 0.00 seconds.
-- Epoch 10
Norm: 105.01, NNZs: 3, Bias: -69.437075, T: 30, Avg. loss: 0.000000, Objective: 0.553461
Total training time: 0.00 seconds.
-- Epoch 11
Norm: 104.70, NNZs: 3, Bias: -69.437075, T: 33, Avg. loss: 0.000000, Objective: 0.550241
Total training time: 0.00 seconds.
-- Epoch 12
Norm: 104.40, NNZs: 3, Bias: -69.437075, T: 36, Avg. loss: 0.000000, Objective: 0.547050
Total training time: 0.00 seconds.
-- Epoch 13
Norm: 104.10, NNZs: 3, Bias: -69.437075, T: 39, Avg. loss: 0.000000, Objective: 0.543886
Total training time: 0.00 seconds.
-- Epoch 14
Norm: 103.80, NNZs: 3, Bias: -69.437075, T: 42, Avg. loss: 0.000000, Objective: 0.540750
Total training time: 0.00 seconds.
-- Epoch 15
Norm: 103.50, NNZs: 3, Bias: -69.437075, T: 45, Avg. loss: 0.000000, Objective: 0.537641
Total training time: 0.00 seconds.
-- Epoch 16
Norm: 103.20, NNZs: 3, Bias: -69.437075, T: 48, Avg. loss: 0.000000, Objective: 0.534558
Total training time: 0.00 seconds.
-- Epoch 17
Norm: 102.91, NNZs: 3, Bias: -69.437075, T: 51, Avg. loss: 0.000000, Objective: 0.531502
Total training time: 0.00 seconds.
-- Epoch 18
Norm: 102.61, NNZs: 3, Bias: -69.437075, T: 54, Avg. loss: 0.000000, Objective: 0.528472
Total training time: 0.00 seconds.
-- Epoch 19
Norm: 102.32, NNZs: 3, Bias: -69.437075, T: 57, Avg. loss: 0.000000, Objective: 0.525468
Total training time: 0.00 seconds.
-- Epoch 20
Norm: 102.03, NNZs: 3, Bias: -69.437075, T: 60, Avg. loss: 0.000000, Objective: 0.522490
Total training time: 0.00 seconds.
-- Epoch 1
Norm: 38.29, NNZs: 3, Bias: 19.940140, T: 4, Avg. loss: 139.687585, Objective: 139.823743
Total training time: 0.00 seconds.
-- Epoch 2
Norm: 32.94, NNZs: 3, Bias: 19.930268, T: 8, Avg. loss: 129.168907, Objective: 129.271641
Total training time: 0.00 seconds.
-- Epoch 3
Norm: 45.33, NNZs: 3, Bias: 29.841071, T: 12, Avg. loss: 0.227750, Objective: 0.306359
Total training time: 0.00 seconds.
-- Epoch 4
Norm: 49.01, NNZs: 3, Bias: 29.831355, T: 16, Avg. loss: 118.894306, Objective: 119.039280
Total training time: 0.00 seconds.
-- Epoch 5
Norm: 56.16, NNZs: 3, Bias: 39.664197, T: 20, Avg. loss: 0.174051, Objective: 0.313135
Total training time: 0.00 seconds.
-- Epoch 6
Norm: 64.84, NNZs: 3, Bias: 39.654632, T: 24, Avg. loss: 108.780776, Objective: 108.993065
Total training time: 0.00 seconds.
-- Epoch 7
Norm: 68.85, NNZs: 3, Bias: 49.410729, T: 28, Avg. loss: 0.101967, Objective: 0.325835
Total training time: 0.00 seconds.
-- Epoch 8
Norm: 80.42, NNZs: 3, Bias: 49.401313, T: 32, Avg. loss: 98.824559, Objective: 99.128056
Total training time: 0.00 seconds.
Convergence after 8 epochs took 0.00 seconds
-- Epoch 1
Norm: 38.29, NNZs: 3, Bias: -19.940140, T: 4, Avg. loss: 139.687585, Objective: 139.823743
Total training time: 0.00 seconds.
-- Epoch 2
Norm: 32.94, NNZs: 3, Bias: -19.930268, T: 8, Avg. loss: 129.168907, Objective: 129.271641
Total training time: 0.00 seconds.
-- Epoch 3
Norm: 45.33, NNZs: 3, Bias: -29.841071, T: 12, Avg. loss: 0.227750, Objective: 0.306359
Total training time: 0.00 seconds.
-- Epoch 4
Norm: 49.01, NNZs: 3, Bias: -29.831355, T: 16, Avg. loss: 118.894306, Objective: 119.039280
Total training time: 0.00 seconds.
-- Epoch 5
Norm: 56.16, NNZs: 3, Bias: -39.664197, T: 20, Avg. loss: 0.174051, Objective: 0.313135
Total training time: 0.00 seconds.
-- Epoch 6
Norm: 64.84, NNZs: 3, Bias: -39.654632, T: 24, Avg. loss: 108.780776, Objective: 108.993065
Total training time: 0.00 seconds.
-- Epoch 7
Norm: 68.85, NNZs: 3, Bias: -49.410729, T: 28, Avg. loss: 0.101967, Objective: 0.325835
Total training time: 0.00 seconds.
-- Epoch 8
Norm: 80.42, NNZs: 3, Bias: -49.401313, T: 32, Avg. loss: 98.824559, Objective: 99.128056
Total training time: 0.00 seconds.
Convergence after 8 epochs took 0.00 seconds
preds 1 unweighted [[2 3]]
preds 2 weighted [[3 2]]

At the end (epoch 8), the model did not converge. After setting tol=None, so that the full 20 epochs run, the test passes.

The same kind of issue appears in doctests. Verbose output of the SGDOneClassSVM doctest from _stochastic_gradient.py, first without the fix:

-- Epoch 1
Norm: 0.97, NNZs: 2, Bias: 2.158447, T: 4, Avg. loss: 1.094272
Total training time: 0.00 seconds.
-- Epoch 2
Norm: 0.00, NNZs: 2, Bias: 1.633124, T: 8, Avg. loss: 0.102961
Total training time: 0.00 seconds.
-- Epoch 3
Norm: 0.00, NNZs: 2, Bias: 0.978807, T: 12, Avg. loss: 0.000000
Total training time: 0.00 seconds.
-- Epoch 4
Norm: 0.00, NNZs: 2, Bias: 0.993995, T: 16, Avg. loss: 0.134821
Total training time: 0.00 seconds.
-- Epoch 5
Norm: 0.26, NNZs: 2, Bias: 1.196683, T: 20, Avg. loss: 0.205538
Total training time: 0.00 seconds.
-- Epoch 6
Norm: 0.00, NNZs: 2, Bias: 1.028259, T: 24, Avg. loss: 0.077648
Total training time: 0.00 seconds.
-- Epoch 7
Norm: 0.00, NNZs: 2, Bias: 1.023254, T: 28, Avg. loss: 0.195961
Total training time: 0.00 seconds.
-- Epoch 8
Norm: 0.17, NNZs: 2, Bias: 1.141260, T: 32, Avg. loss: 0.139745
Total training time: 0.00 seconds.
Convergence after 8 epochs took 0.00 seconds

With PR:

-- Epoch 1
Norm: 0.97, NNZs: 2, Bias: 2.158447, T: 4, Avg. loss: 1.094272, Objective: 2.118822
Total training time: 0.00 seconds.
-- Epoch 2
Norm: 0.00, NNZs: 2, Bias: 1.633124, T: 8, Avg. loss: 0.102961, Objective: 1.103998
Total training time: 0.00 seconds.
-- Epoch 3
Norm: 0.00, NNZs: 2, Bias: 0.978807, T: 12, Avg. loss: 0.000000, Objective: 0.685541
Total training time: 0.00 seconds.
-- Epoch 4
Norm: 0.00, NNZs: 2, Bias: 0.993995, T: 16, Avg. loss: 0.134821, Objective: 0.666610
Total training time: 0.00 seconds.
-- Epoch 5
Norm: 0.26, NNZs: 2, Bias: 1.196683, T: 20, Avg. loss: 0.205538, Objective: 0.760828
Total training time: 0.00 seconds.
-- Epoch 6
Norm: 0.00, NNZs: 2, Bias: 1.028259, T: 24, Avg. loss: 0.077648, Objective: 0.638001
Total training time: 0.00 seconds.
-- Epoch 7
Norm: 0.00, NNZs: 2, Bias: 1.023254, T: 28, Avg. loss: 0.195961, Objective: 0.697667
Total training time: 0.00 seconds.
-- Epoch 8
Norm: 0.17, NNZs: 2, Bias: 1.141260, T: 32, Avg. loss: 0.139745, Objective: 0.653300
Total training time: 0.01 seconds.
-- Epoch 9
Norm: 0.00, NNZs: 2, Bias: 1.035687, T: 36, Avg. loss: 0.023807, Objective: 0.596101
Total training time: 0.01 seconds.
-- Epoch 10
Norm: 0.14, NNZs: 2, Bias: 1.131193, T: 40, Avg. loss: 0.098821, Objective: 0.617901
Total training time: 0.01 seconds.
-- Epoch 11
Norm: 0.00, NNZs: 2, Bias: 1.044003, T: 44, Avg. loss: 0.015016, Objective: 0.581711
Total training time: 0.01 seconds.
-- Epoch 12
Norm: 0.11, NNZs: 2, Bias: 0.960300, T: 48, Avg. loss: 0.010130, Objective: 0.511202
Total training time: 0.01 seconds.
-- Epoch 13
Norm: 0.00, NNZs: 2, Bias: 1.037534, T: 52, Avg. loss: 0.144702, Objective: 0.646752
Total training time: 0.01 seconds.
-- Epoch 14
Norm: 0.10, NNZs: 2, Bias: 0.965841, T: 56, Avg. loss: 0.008692, Objective: 0.509533
Total training time: 0.01 seconds.
-- Epoch 15
Norm: 0.00, NNZs: 2, Bias: 1.032735, T: 60, Avg. loss: 0.125267, Objective: 0.626857
Total training time: 0.01 seconds.
-- Epoch 16
Norm: 0.09, NNZs: 2, Bias: 0.970037, T: 64, Avg. loss: 0.007608, Objective: 0.508299
Total training time: 0.01 seconds.
-- Epoch 17
Norm: 0.00, NNZs: 2, Bias: 1.029035, T: 68, Avg. loss: 0.110428, Objective: 0.611710
Total training time: 0.01 seconds.
-- Epoch 18
Norm: 0.08, NNZs: 2, Bias: 0.973325, T: 72, Avg. loss: 0.006762, Objective: 0.507350
Total training time: 0.01 seconds.
-- Epoch 19
Norm: 0.05, NNZs: 2, Bias: 1.025407, T: 76, Avg. loss: 0.079460, Objective: 0.573158
Total training time: 0.01 seconds.
-- Epoch 20
Norm: 0.05, NNZs: 2, Bias: 0.975284, T: 80, Avg. loss: 0.018781, Objective: 0.519118
Total training time: 0.01 seconds.
-- Epoch 21
Norm: 0.05, NNZs: 2, Bias: 1.023014, T: 84, Avg. loss: 0.078126, Objective: 0.578269
Total training time: 0.01 seconds.
Convergence after 21 epochs took 0.02 seconds

It takes more epochs, but because there are too few samples per epoch, the epoch at which the optimization stops is fairly arbitrary.

With tol=None, it converges:

-- Epoch 1
Norm: 0.97, NNZs: 2, Bias: 2.158447, T: 4, Avg. loss: 1.094272, Objective: 2.118822
Total training time: 0.00 seconds.
-- Epoch 2
Norm: 0.00, NNZs: 2, Bias: 1.633124, T: 8, Avg. loss: 0.102961, Objective: 1.103998
Total training time: 0.00 seconds.
-- Epoch 3
Norm: 0.00, NNZs: 2, Bias: 0.978807, T: 12, Avg. loss: 0.000000, Objective: 0.685541
Total training time: 0.00 seconds.
-- Epoch 4
Norm: 0.00, NNZs: 2, Bias: 0.993995, T: 16, Avg. loss: 0.134821, Objective: 0.666610
Total training time: 0.00 seconds.

...

-- Epoch 994
Norm: 0.00, NNZs: 2, Bias: 0.999215, T: 3976, Avg. loss: 0.000448, Objective: 0.500307
Total training time: 0.25 seconds.
-- Epoch 995
Norm: 0.00, NNZs: 2, Bias: 1.000221, T: 3980, Avg. loss: 0.001845, Objective: 0.501704
Total training time: 0.25 seconds.
-- Epoch 996
Norm: 0.00, NNZs: 2, Bias: 0.999216, T: 3984, Avg. loss: 0.000447, Objective: 0.500306
Total training time: 0.25 seconds.
-- Epoch 997
Norm: 0.00, NNZs: 2, Bias: 1.000220, T: 3988, Avg. loss: 0.001841, Objective: 0.501701
Total training time: 0.25 seconds.
-- Epoch 998
Norm: 0.00, NNZs: 2, Bias: 0.999217, T: 3992, Avg. loss: 0.000446, Objective: 0.500305
Total training time: 0.25 seconds.
-- Epoch 999
Norm: 0.00, NNZs: 2, Bias: 1.000219, T: 3996, Avg. loss: 0.001838, Objective: 0.501697
Total training time: 0.25 seconds.
-- Epoch 1000
Norm: 0.00, NNZs: 2, Bias: 0.999218, T: 4000, Avg. loss: 0.000445, Objective: 0.500305
Total training time: 0.25 seconds.

@kostayScr kostayScr marked this pull request as draft July 31, 2025 10:54
@kostayScr kostayScr changed the title Fix SGD convergence criteria BUG Fix SGD convergence criteria Jul 31, 2025
@kostayScr kostayScr changed the title BUG Fix SGD convergence criteria BUG: Fix SGD convergence criteria Jul 31, 2025
@kostayScr kostayScr marked this pull request as ready for review July 31, 2025 16:22
@kostayScr kostayScr changed the title BUG: Fix SGD convergence criteria BUG: Fix SGD models(SGDRegressor etc.) convergence criteria Aug 3, 2025
Member

@adrinjalali adrinjalali left a comment


Maybe @antoinebaker or @lorentzenchr could have a look.

This also needs a FIX changelog

@@ -2220,9 +2220,9 @@ class SGDOneClassSVM(OutlierMixin, BaseSGD):
>>> import numpy as np
>>> from sklearn import linear_model
>>> X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
>>> clf = linear_model.SGDOneClassSVM(random_state=42)
>>> clf = linear_model.SGDOneClassSVM(random_state=42, tol=None)
Member


I'm not sure why these changes are necessary

Author

@kostayScr kostayScr Aug 4, 2025


It's to make the doctest pass. With a tiny sample size of 4, each epoch performs only 4 updates, so the loss fluctuates heavily from epoch to epoch and the default "no improvement in 5 epochs" stopping criterion is meaningless. The doctest previously "converged" only by luck, so setting tol=None fixes that: there are too few samples/updates per epoch to get a meaningful estimate of the loss. See the comment above: #31856 (comment).
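The stopping rule in question can be sketched as follows (a simplified pure-Python rendering of the "no improvement in n_iter_no_change epochs" check; the real logic lives in the Cython training loop):

```python
def converged(epoch_metrics, tol=1e-3, n_iter_no_change=5):
    """Stop once the metric fails to improve on its best value by more
    than tol for n_iter_no_change consecutive epochs."""
    best = float("inf")
    no_improvement = 0
    for value in epoch_metrics:
        if value > best - tol:
            no_improvement += 1
            if no_improvement >= n_iter_no_change:
                return True
        else:
            no_improvement = 0
        best = min(best, value)
    return False

# A steadily decreasing metric never triggers the rule...
print(converged([10.0 - i for i in range(10)]))  # False
# ...but a noisy metric oscillating around its best value does,
# which is exactly what happens with only 4 updates per epoch.
print(converged([5.0, 0.1, 5.0, 0.1, 5.0, 0.1, 5.0, 0.1]))  # True
```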

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Development

Successfully merging this pull request may close these issues.

SGDOneClassSVM model does not converge with default stopping criteria(stops prematurely)
3 participants