
ENH replace Cython loss functions in SGD part 2 #28029


Merged

Conversation

lorentzenchr
Member

Reference Issues/PRs

Follow-up of #27999 (which needs to be merged first). Partly addresses #15123.

What does this implement/fix? Explain your changes.

This PR replaces the Cython loss functions of SGD and SAGA with the ones from _loss (SquaredLoss, Huber, LogLoss) and inherits from _loss._loss.CyLossFunction for the remaining ones (Hinge, ..., and Multinomial).

Also, the loss functions are removed from sklearn.linear_model.__init__.

Any other comments?

Only merge after release 1.5, i.e. this PR is to be released with v1.6.
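To make the scope of the replacement concrete, here is a pure-Python sketch of the per-sample loss/gradient pairs involved (squared error, Huber, log loss, hinge). The function names and signatures below are illustrative assumptions for this comment only, not scikit-learn's actual private `_loss` API:

```python
import math

# Illustrative sketch (hypothetical names): each loss returns
# (loss_value, gradient w.r.t. the raw prediction) for one sample.

def squared_loss(y_true, raw):
    """Half squared error: 0.5 * (raw - y)^2."""
    r = raw - y_true
    return 0.5 * r * r, r

def huber_loss(y_true, raw, delta=1.0):
    """Huber loss: quadratic near zero, linear in the tails."""
    r = raw - y_true
    if abs(r) <= delta:
        return 0.5 * r * r, r
    return delta * (abs(r) - 0.5 * delta), delta * math.copysign(1.0, r)

def log_loss(y_true, raw):
    """Binary log loss for y_true in {0, 1}, raw = log-odds."""
    p = 1.0 / (1.0 + math.exp(-raw))
    loss = -y_true * math.log(p) - (1.0 - y_true) * math.log(1.0 - p)
    return loss, p - y_true

def hinge_loss(y_true, raw):
    """Hinge loss for y_true in {-1, +1}; subgradient at the kink is 0."""
    margin = y_true * raw
    if margin < 1.0:
        return 1.0 - margin, -y_true
    return 0.0, 0.0
```

Sharing one such set of (loss, gradient) routines across SGD, SAGA and the other solvers is the point of the refactoring: one implementation to test and maintain instead of per-solver Cython copies.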


github-actions bot commented Dec 27, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 0dd76c3.

@lorentzenchr
Member Author

asv benchmarks

asv compare b8d783d2 68f35c3e                   

All benchmarks:

| Change | Before [b8d783d2] <main> | After [68f35c3e] <replace_sgd_with_common_loss_part_2~1> | Ratio | Benchmark (Parameter)                                                            |
|--------|--------------------------|----------------------------------------------------------|-------|----------------------------------------------------------------------------------|
|        | 109M                     | 109M                                                     |  1    | linear_model.LogisticRegressionBenchmark.peakmem_fit('dense', 'lbfgs', 1)        |
|        | 90.1M                    | 90.1M                                                    |  1    | linear_model.LogisticRegressionBenchmark.peakmem_fit('dense', 'saga', 1)         |
|        | 385M                     | 384M                                                     |  1    | linear_model.LogisticRegressionBenchmark.peakmem_fit('sparse', 'lbfgs', 1)       |
|        | 110M                     | 110M                                                     |  1    | linear_model.LogisticRegressionBenchmark.peakmem_fit('sparse', 'saga', 1)        |
|        | 108M                     | 108M                                                     |  1    | linear_model.LogisticRegressionBenchmark.peakmem_predict('dense', 'lbfgs', 1)    |
|        | 94.3M                    | 94.6M                                                    |  1    | linear_model.LogisticRegressionBenchmark.peakmem_predict('dense', 'saga', 1)     |
|        | 108M                     | 109M                                                     |  1    | linear_model.LogisticRegressionBenchmark.peakmem_predict('sparse', 'lbfgs', 1)   |
|        | 96.5M                    | 96.7M                                                    |  1    | linear_model.LogisticRegressionBenchmark.peakmem_predict('sparse', 'saga', 1)    |
| +      | 15.2±2ms                 | 17.7±2ms                                                 |  1.16 | linear_model.LogisticRegressionBenchmark.time_fit('dense', 'lbfgs', 1)           |
|        | 1.97±0.04s               | 1.93±0.07s                                               |  0.98 | linear_model.LogisticRegressionBenchmark.time_fit('dense', 'saga', 1)            |
|        | 963±20ms                 | 963±20ms                                                 |  1    | linear_model.LogisticRegressionBenchmark.time_fit('sparse', 'lbfgs', 1)          |
| +      | 1.87±0.01s               | 2.12±0.06s                                               |  1.14 | linear_model.LogisticRegressionBenchmark.time_fit('sparse', 'saga', 1)           |
|        | 3.00±0.5ms               | 3.05±0.5ms                                               |  1.02 | linear_model.LogisticRegressionBenchmark.time_predict('dense', 'lbfgs', 1)       |
|        | 1.42±0.01ms              | 1.44±0.02ms                                              |  1.01 | linear_model.LogisticRegressionBenchmark.time_predict('dense', 'saga', 1)        |
|        | 6.24±0.04ms              | 6.31±0.09ms                                              |  1.01 | linear_model.LogisticRegressionBenchmark.time_predict('sparse', 'lbfgs', 1)      |
|        | 4.81±0.02ms              | 4.95±0.08ms                                              |  1.03 | linear_model.LogisticRegressionBenchmark.time_predict('sparse', 'saga', 1)       |
|        | 0.17488127353035546      | 0.17488127353035546                                      |  1    | linear_model.LogisticRegressionBenchmark.track_test_score('dense', 'lbfgs', 1)   |
|        | 0.7789925817802057       | 0.7789925817802057                                       |  1    | linear_model.LogisticRegressionBenchmark.track_test_score('dense', 'saga', 1)    |
|        | 0.06538461538461539      | 0.06538461538461539                                      |  1    | linear_model.LogisticRegressionBenchmark.track_test_score('sparse', 'lbfgs', 1)  |
|        | 0.5765140080078162       | 0.5765140080078162                                       |  1    | linear_model.LogisticRegressionBenchmark.track_test_score('sparse', 'saga', 1)   |
|        | 0.17920161231776         | 0.17920161231776                                         |  1    | linear_model.LogisticRegressionBenchmark.track_train_score('dense', 'lbfgs', 1)  |
|        | 0.7998934724948512       | 0.7998934724948512                                       |  1    | linear_model.LogisticRegressionBenchmark.track_train_score('dense', 'saga', 1)   |
|        | 0.0681998556998557       | 0.0681998556998557                                       |  1    | linear_model.LogisticRegressionBenchmark.track_train_score('sparse', 'lbfgs', 1) |
|        | 0.6908414295256007       | 0.6908414295256007                                       |  1    | linear_model.LogisticRegressionBenchmark.track_train_score('sparse', 'saga', 1)  |
|        | 165M                     | 165M                                                     |  1    | linear_model.SGDRegressorBenchmark.peakmem_fit('dense')                          |
|        | 93.8M                    | 93.7M                                                    |  1    | linear_model.SGDRegressorBenchmark.peakmem_fit('sparse')                         |
|        | 166M                     | 166M                                                     |  1    | linear_model.SGDRegressorBenchmark.peakmem_predict('dense')                      |
|        | 93.7M                    | 93.7M                                                    |  1    | linear_model.SGDRegressorBenchmark.peakmem_predict('sparse')                     |
|        | 4.50±0.02s               | 4.40±0.02s                                               |  0.98 | linear_model.SGDRegressorBenchmark.time_fit('dense')                             |
|        | 3.46±0.02s               | 3.51±0.01s                                               |  1.01 | linear_model.SGDRegressorBenchmark.time_fit('sparse')                            |
|        | 7.77±0.4ms               | 8.08±0.1ms                                               |  1.04 | linear_model.SGDRegressorBenchmark.time_predict('dense')                         |
|        | 1.68±0.03ms              | 1.75±0.04ms                                              |  1.04 | linear_model.SGDRegressorBenchmark.time_predict('sparse')                        |
|        | 0.9636293915890342       | 0.9636293915890342                                       |  1    | linear_model.SGDRegressorBenchmark.track_test_score('dense')                     |
|        | 0.961311884809733        | 0.961311884809733                                        |  1    | linear_model.SGDRegressorBenchmark.track_test_score('sparse')                    |
|        | 0.9641785427112692       | 0.9641785427112692                                       |  1    | linear_model.SGDRegressorBenchmark.track_train_score('dense')                    |
|        | 0.9621441831314548       | 0.9621441831314548                                       |  1    | linear_model.SGDRegressorBenchmark.track_train_score('sparse')                   |
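A note on reading the table: asv's compare view marks a benchmark with `+` (regression) or `-` (improvement) only when the after/before ratio crosses a threshold factor; rows within the band are left unflagged, and the track_* score rows being bit-identical confirms the refactoring did not change results. A minimal sketch of that flagging logic, assuming a factor of 1.1 (the actual asv threshold is configurable, so treat the value as an assumption):

```python
def flag_change(before, after, factor=1.1):
    """Classify a benchmark change the way an asv-style compare view does.

    Returns ('+', ratio) for a regression (ratio > factor),
    ('-', ratio) for an improvement (ratio < 1/factor),
    and ('', ratio) when the change is within the threshold band.
    The 1.1 factor is an assumption for illustration.
    """
    ratio = after / before
    if ratio > factor:
        return "+", ratio
    if ratio < 1.0 / factor:
        return "-", ratio
    return "", ratio
```

For example, the `time_fit('dense', 'lbfgs', 1)` row (15.2 ms before, 17.7 ms after) lands above the threshold and is flagged `+`, while the `time_fit('dense', 'saga', 1)` row (ratio 0.98) stays unflagged.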

@lorentzenchr
Member Author

The failing tests seem to be caused by #28046.

@ogrisel
Member

ogrisel commented Jan 10, 2024

There are conflicts to resolve before we can confirm whether the merge of #28046 makes the tests pass.

@OmarManzoor
Contributor

OmarManzoor commented Jul 18, 2024

@lorentzenchr would it be okay if I work on these PRs to take them forward?

Contributor

@OmarManzoor OmarManzoor left a comment


LGTM. Thank you @lorentzenchr
The tests are passing and the benchmarks show no regression.

@OmarManzoor OmarManzoor added the Waiting for Second Reviewer First reviewer is done, need a second one! label Jul 19, 2024
@@ -1,26 +0,0 @@
# SPDX-License-Identifier: BSD-3-Clause
Member


Let's assume no other external project was using those functions.

@jjerphan jjerphan removed the Waiting for Second Reviewer First reviewer is done, need a second one! label Jul 23, 2024
@OmarManzoor
Contributor

Thanks for the review, Julien. I fixed the typo. I'll let you merge after taking another look.

@jjerphan jjerphan merged commit 936a391 into scikit-learn:main Jul 24, 2024
30 checks passed
@lorentzenchr lorentzenchr deleted the replace_sgd_with_common_loss_part_2 branch August 4, 2024 10:02
MarcBresson pushed a commit to MarcBresson/scikit-learn that referenced this pull request Sep 2, 2024
Co-authored-by: Omar Salman <omar.salman@arbisoft.com>