BUG: Fix SGD models(SGDRegressor etc.) convergence criteria #31856
Conversation
Had to fix a test that was not passing: large loss/objective spikes during convergence, due to the tiny sample size.
At the end (epoch 8) it did not converge. The same kind of issue shows up in the doctests. Verbose output of the doctest from `_stochastic_gradient.py` for `SGDOneClassSVM`, first without the fix:
With the PR:
It takes more epochs, but because there are too few samples per epoch, the iteration after which the optimization stops is fairly random. With `tol=None`, it converges:
Maybe @antoinebaker or @lorentzenchr could have a look.
This also needs a FIX changelog
@@ -2220,9 +2220,9 @@ class SGDOneClassSVM(OutlierMixin, BaseSGD):
     >>> import numpy as np
     >>> from sklearn import linear_model
     >>> X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
-    >>> clf = linear_model.SGDOneClassSVM(random_state=42)
+    >>> clf = linear_model.SGDOneClassSVM(random_state=42, tol=None)
I'm not sure why these changes are necessary
It's to make the doctest pass. The tiny sample size of 4 with SGD leads to large loss fluctuations each epoch (each epoch makes only 4 updates), so the default stopping criterion of no improvement in 5 epochs is meaningless: there are too few samples/updates per epoch to get a meaningful estimate of the loss. The model "converges" by luck, so setting `tol=None` fixes that. See the comment above, #31856 (comment).
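A minimal sketch of the effect (my illustration, not code from the PR): on this 4-sample data the default tolerance-based stopping can halt after an essentially arbitrary number of epochs, while `tol=None` runs the full `max_iter`.

```python
# Minimal sketch (not part of the PR): with only 4 samples, the
# per-epoch average loss is noisy, so the default tolerance-based
# stopping (tol=1e-3, no improvement for 5 epochs) triggers almost
# arbitrarily. tol=None disables the convergence check entirely.
import numpy as np
from sklearn.linear_model import SGDOneClassSVM

X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])

# Default stopping: the reported number of epochs is essentially luck.
clf_default = SGDOneClassSVM(random_state=42).fit(X)
print("epochs with default tol:", clf_default.n_iter_)

# tol=None: no convergence check, runs all max_iter epochs.
clf_no_tol = SGDOneClassSVM(random_state=42, tol=None).fit(X)
print("epochs with tol=None:", clf_no_tol.n_iter_)
```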
Reference Issues/PRs
Based on draft PR #30031. Closes #30027.
What does this implement/fix? Explain your changes.
Changes the SGD optimization loop in `sklearn/linear_model/_sgd_fast.pyx.tp` to use a correct stopping criterion. Instead of the raw error (loss), it now uses the full objective value: the full objective includes the regularization term for regression/classification, and the intercept term for the one-class SVM model. This change prevents incorrect premature stopping of the optimization, often after 6 epochs. The effect is especially pronounced with `SGDOneClassSVM`, but it also affects `SGDRegressor` and `SGDClassifier`.

To implement this, the PR modifies the `WeightVector` class to also accumulate the L1 norm, and calculates the objective value in the optimization loop. It also adds a test comparing `SGDOneClassSVM` to the liblinear one-class SVM.
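As an illustration only, here is a hedged Python sketch of the stopping rule described above. The real implementation is Cython in `_sgd_fast.pyx.tp`; the helper names, the L2-only penalty, and the sign of the intercept term are my assumptions, not the PR's code.

```python
# Hedged sketch (assumptions, not the PR's Cython code): check
# convergence on the full objective per epoch, not the raw data loss.
import numpy as np

def full_objective(avg_loss, w, intercept, alpha, one_class=False):
    """Hypothetical helper. Regression/classification add the
    regularization penalty (only L2 shown here); for the one-class SVM
    the intercept (offset) also enters the objective (sign assumed)."""
    obj = avg_loss + 0.5 * alpha * np.dot(w, w)
    if one_class:
        obj -= intercept
    return obj

def run_epochs(epoch_objectives, tol=1e-3, n_iter_no_change=5):
    """No-improvement rule applied to the objective: count epochs whose
    objective is not at least tol better than the best seen so far,
    and stop after n_iter_no_change such epochs in a row."""
    best = np.inf
    no_improvement = 0
    for epoch, obj in enumerate(epoch_objectives, start=1):
        if tol is not None and obj > best - tol:
            no_improvement += 1
        else:
            no_improvement = 0
        best = min(best, obj)
        if no_improvement >= n_iter_no_change:
            return epoch  # stopped: objective plateaued
    return len(epoch_objectives)  # hit max_iter
```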
Before the fix (example from the linked issue):
After the fix, the model converges:
See the linked issue for the full code.
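One way to sanity-check the converged model, in the spirit of the comparison test mentioned above (a hedged sketch; the PR's actual test may differ, and I use `sklearn.svm.OneClassSVM` with a linear kernel as the reference solver rather than liblinear):

```python
# Hedged verification sketch (not the PR's test): a fully converged
# linear SGD one-class SVM should roughly agree with an exact
# one-class SVM restricted to a linear kernel on the same data.
import numpy as np
from sklearn.linear_model import SGDOneClassSVM
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)
X = rng.randn(500, 2)

nu = 0.5
sgd = SGDOneClassSVM(nu=nu, tol=None, max_iter=1000, random_state=0).fit(X)
exact = OneClassSVM(kernel="linear", nu=nu).fit(X)

# Fraction of identical inlier/outlier decisions (+1/-1 labels).
agreement = np.mean(sgd.predict(X) == exact.predict(X))
print(f"prediction agreement: {agreement:.2%}")
```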
Any other comments?
This PR probably needs a changelog entry, since the output of SGD models (regressor, classifier, one-class) can change for `tol != None`.