DEP PassiveAggressiveClassifier and PassiveAggressiveRegressor #29097


Merged (36 commits, Aug 12, 2025)

Conversation

lorentzenchr
Member

@lorentzenchr lorentzenchr commented May 24, 2024

Reference Issues/PRs

Solves #15161. See also #29088 (comment).

What does this implement/fix? Explain your changes.

This PR deprecates the two classes PassiveAggressiveClassifier and PassiveAggressiveRegressor, which were introduced in #1259. A user can easily use SGDClassifier and SGDRegressor instead, and I cannot see the added value of these two classes.
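
A rough sketch of the kind of replacement this has in mind, assuming the learning_rate="pa1" option that this PR exposes; the exact parameter mapping is worked out later in this thread:

from sklearn.linear_model import PassiveAggressiveClassifier, SGDClassifier

# Before: PA-I with aggressiveness parameter C
pa = PassiveAggressiveClassifier(C=1.0, loss="hinge")

# After (sketch): SGD with the PA-I learning-rate schedule and no penalty
sgd = SGDClassifier(loss="hinge", penalty=None, learning_rate="pa1", eta0=1.0)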

@scikit-learn/core-devs ping for decision (in particular in case you are against the deprecation)


github-actions bot commented May 24, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: b2144ee.

@adrinjalali
Member

I'm +1 on this:

  • I haven't seen anybody use it (except maybe @amueller?)
  • If we deprecate and people start complaining, we can revert the deprecation. So in a sense, deprecation is a way to check usage here.
  • The deprecation message should always clearly point to an alternative, which I think is the case here.

@glemaitre
Member

The deprecation message should always clearly point to an alternative, which I think is the case here.

I think we should do a bit more and specify how to get a similar behaviour by setting the right parameters.

@lorentzenchr
Member Author

In fa3d842, I added C to SGDClassifier and SGDRegressor to make it more accessible.
I could imagine putting the equivalent PA estimators into the SGD docstrings, maybe once the deprecation is carried out.

@adrinjalali
Member

Nice. We need to add more information to the docstring about C, though. It's very short, and it's not clear what it does when I read it. Otherwise LGTM.

@lorentzenchr lorentzenchr added this to the 1.6 milestone Jun 17, 2024
@adrinjalali
Member

@lorentzenchr getting close to the release, would you mind resolving issues here?

@lorentzenchr
Member Author

@lorentzenchr getting close to the release, would you mind resolving issues here?

I can resolve merge conflicts, but I won't do this:

We need to add more information to the docstring about C, though.

SGDRegressor(
    penalty=None,
    alpha=1.0,
    C=1.0,
)
Member

I must be missing something; our SGDRegressor doesn't have a C param.

Member Author

BaseSGDClassifier has an attribute C. This PR exposes this attribute as a parameter of SGDClassifier (same for the regressors).

I don't really like that. Proposition: I revert exposing C and instead write in the deprecation note, using somewhat private API:

from sklearn.linear_model import SGDClassifier

clf = SGDClassifier(
    penalty=None,
    alpha=1.0,
    eta0=1.0,
    learning_rate="pa1",
    loss="hinge",
)
clf.C = 1.0  # THIS IS USING PRIVATE API

Member Author

See c24c95f. I can revert if you dislike it.

Member

If we're deprecating these estimators and telling users to use SGDRegressor and SGDClassifier instead, it makes sense to actually have a public way of doing that. So I'm in favor of exposing C in both SGD estimators, documenting it, and then here simply creating an equivalent estimator using it.

Member Author

How about leaving them private for the time being?

I would like to have a dedicated discussion on whether to

  • expose C in SGD; xor
  • remove C and all passive-aggressive things altogether

Member

I think this PR is the right place to have that discussion. I'd like others to weigh in about this point.

maybe @ogrisel @amueller @GaelVaroquaux @adam2392 would have an opinion?

Member

I'm not super familiar with passive-aggressive vs. SGD, but a loose read suggests passive-aggressive allows more online updating. Though I thought SGD easily allows online training too, so why would someone use one over the other?

Is it true that the only parameter that makes these different is the "C" step-size parameter?

Pragmatically: if no one uses "C", then I kind of agree it might be simplest to just remove it. If there's something I'm missing, though, let me know.

@lorentzenchr lorentzenchr Oct 24, 2024

PassiveAggressiveClassifier was first implemented in #1259 in 2012. I have never seen it used anywhere, and I think our SGD estimators offer enough for online learning (partial_fit), see the sketch below. So I would be fine with removing it completely, i.e. also removing C in SGD, unless someone steps in to keep it.
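
A minimal sketch of that online-learning workflow with SGDClassifier and partial_fit (synthetic data, illustrative only):

import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)
clf = SGDClassifier(loss="hinge")
classes = np.array([0, 1])  # all classes must be declared on the first call

# Stream mini-batches and update the model online
for _ in range(10):
    X_batch = rng.randn(32, 5)
    y_batch = rng.randint(0, 2, size=32)
    clf.partial_fit(X_batch, y_batch, classes=classes)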

Member Author

Having read through the code and parts of the papers, I could also live with making learning_rate="pa1" and "pa2", as well as C, public in SGD.... In that case, I would rename C to something like pa_C.

Member

I am partially in favor of just simplifying: deprecating those parameters / the PA classifiers altogether, unless someone has an actual use case for them.

@lorentzenchr
Member Author

The deprecated decorator interferes with the signature, e.g. in check_do_not_raise_errors_in_init_or_set_params:

    params = signature(Estimator).parameters

I need help with avoiding that (I won't fix deprecated), or rather with (temporarily) switching off those tests for PassiveAggressiveXX.
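
A minimal sketch of the kind of failure meant here, assuming the deprecated wrapper does not forward the wrapped class's __init__ signature:

from inspect import signature

from sklearn.linear_model import PassiveAggressiveClassifier

# If the decorator's wrapper hides the original __init__, this may report
# the wrapper's generic (*args, **kwargs) instead of the real parameters,
# breaking checks that introspect the constructor signature.
params = signature(PassiveAggressiveClassifier).parameters
print(list(params))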

@lorentzenchr lorentzenchr force-pushed the dep_passive_aggressive branch from 7b21cfb to c367698 Compare October 23, 2024 21:20
@lorentzenchr
Member Author

lorentzenchr commented Oct 24, 2024

Note that 306a5fa should be made a PR of its own in case this PR is not merged.

Edit: I opened #30145.

@glemaitre glemaitre removed this from the 1.6 milestone Nov 25, 2024
@lorentzenchr
Member Author

In both our pytest and doc build, we can set the warning filter for this specific warning to be ignored. That should fix the CI.

Where can I filter warnings for the doc build?

@adrinjalali
Member

scikit-learn/doc/conf.py

Lines 861 to 868 in ae9d088

warnings.filterwarnings(
    "ignore",
    category=UserWarning,
    message=(
        "Matplotlib is currently using agg, which is a"
        " non-GUI backend, so cannot show the figure."
    ),
)

@lorentzenchr lorentzenchr force-pushed the dep_passive_aggressive branch from a9e758b to 64fde84 Compare July 31, 2025 14:06
@lorentzenchr
Member Author

While the CI failure of Ubuntu_Jammy_Jellyfish pymin_conda_forge_openblas_ubuntu_2204 is clearer because it sets "SKLEARN_WARNINGS_AS_ERRORS": "1" (though it is not 100% clear which option supersedes which), the failure of ci/circleci: doc is not clear to me. I added the filterwarning, but it still errors. It does not print where, and I can't reproduce it locally, so I need help with this.

@lesteve
Member

lesteve commented Jul 31, 2025

Maybe add your warning to

# Plotly deprecated something which we're not using, but internally it's used
# and needs to be fixed on their side.
# https://github.com/plotly/plotly.py/issues/4997
WarningInfo(
    "ignore",
    message=".+scattermapbox.+deprecated.+scattermap.+instead",
    category=DeprecationWarning,
),

This is used to centralize the warnings that we cannot do much about, for both the doc build and the tests.
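
Following that suggestion, an entry for this PR's warning could look like the sketch below; the exact message pattern and warning category are assumptions:

# Hypothetical entry for the PassiveAggressive deprecation warning
WarningInfo(
    "ignore",
    message=".*PassiveAggressive(Classifier|Regressor) was deprecated.*",
    category=FutureWarning,
),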

@lorentzenchr
Member Author

lorentzenchr commented Aug 1, 2025

Over the course of this PR, I have become something of an expert on the Passive-Aggressive algorithms, with two insights:

  1. Interestingly, PA can be generalized to all loss functions $\ell$ with minimum zero, not just the hinge and epsilon-insensitive losses (this was unclear before, see WIP: Adding Passive Aggressive learning rates #1259 (comment)).
  2. It can be formulated as a normal SGD scheme ($\tau$ in the Online Passive-Aggressive Algorithms paper becomes $\eta$ in SGD). This way it becomes clear that it actually minimizes a loss (unclear before, see WIP: Adding Passive Aggressive learning rates #1259 (comment) and WIP: Adding Passive Aggressive learning rates #1259 (comment)).

Let $\ell(x\cdot w, y)$ be the loss for a single sample vector $x$ with weights $w$ and observed class $y$, with $\min \ell(x\cdot w, y) = 0$.

The main trick is to use a linear approximation of the loss, $\ell(x\cdot w, y) \approx \ell(x\cdot w_t, y) + (w - w_t)\cdot x \ell^\prime(x\cdot w_t, y)$ (derivative with respect to the first argument), which seems appropriate for the purpose of SGD.
Note that $\frac{d}{dw}\ell(x\cdot w, y)\big|_{w=w_t} = x \ell^\prime(x\cdot w_t, y)$.

PA-I

(Eq. 6) Update current weights $w_t$ as $w_{t+1} = \mathrm{argmin}_w \frac{1}{2}\lVert w - w_t\rVert^2 + C\xi$ such that $\ell(x\cdot w, y)\leq\xi$ and $0\leq\xi$. C is a free positive parameter controlling the step size.

Solution: $w_{t+1} = w_t - \eta x \ell^\prime(x\cdot w_t, y)$ with $\eta=\min(C, \frac{\ell(x\cdot w_t, y)}{\lVert x\rVert^2\ell^\prime(x\cdot w_t, y)^2})$.

For the hinge loss with $y=\pm 1$, one has $\ell^{\prime 2} = y^2 = 1$ and recovers $\eta=\min(C, \frac{\ell(x\cdot w_t, y)}{\lVert x\rVert^2})$ from the paper (there called $\tau$), with update $w_{t+1} = w_t + \eta x y$.

PA-II

(Eq. 7) Update current weights $w_t$ as $w_{t+1} = \mathrm{argmin}_w \frac{1}{2}\lVert w - w_t\rVert^2 + C\xi^2$ such that $\ell(x\cdot w, y)\leq\xi$.

Solution: $w_{t+1} = w_t - \eta x \ell^\prime(x\cdot w_t, y)$ with $\eta=\frac{\ell(x\cdot w_t, y)}{\lVert x\rVert^2\ell^\prime(x\cdot w_t, y)^2 + \frac{1}{2C}}$.
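
As a quick check of this step size, substitute the SGD ansatz into the linearized objective and minimize over $\eta$ (a verification sketch in the notation above): with $w = w_t - \eta x \ell^\prime$ the linearized loss becomes $\xi \approx \ell_t - \eta \lVert x\rVert^2 \ell^{\prime 2}$ (writing $\ell_t = \ell(x\cdot w_t, y)$ and $\ell^\prime = \ell^\prime(x\cdot w_t, y)$), so the PA-II objective reads

$$\frac{1}{2}\eta^2\lVert x\rVert^2\ell^{\prime 2} + C\left(\ell_t - \eta\lVert x\rVert^2\ell^{\prime 2}\right)^2.$$

Setting the derivative with respect to $\eta$ to zero gives $\eta\left(\lVert x\rVert^2\ell^{\prime 2} + \frac{1}{2C}\right) = \ell_t$, i.e. exactly the $\eta$ above.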

I might consider a follow-up PR that would generalize, but actually simplify, the SGD code.
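
A minimal sketch of what such a unified update could look like for the hinge loss, following the derivation above (illustrative only, not scikit-learn's actual implementation):

import numpy as np

def pa_step(w, x, y, C, variant="pa1"):
    """One Passive-Aggressive update written as an SGD step with an
    adaptive step size, for the hinge loss with labels y in {-1, +1}."""
    loss = max(0.0, 1.0 - y * np.dot(x, w))  # hinge loss, minimum is zero
    if loss == 0.0:
        return w  # "passive": no update when the loss is already zero
    # For the hinge loss, x * l'(x.w, y) = -y * x and l'^2 = y^2 = 1
    grad = -y * x
    sq_norm = np.dot(x, x)  # ||x||^2
    if variant == "pa1":
        eta = min(C, loss / sq_norm)  # PA-I step size (tau in the paper)
    else:
        eta = loss / (sq_norm + 1.0 / (2.0 * C))  # PA-II step size
    return w - eta * grad  # "aggressive": plain SGD-style update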


@adrinjalali adrinjalali left a comment


This looks pretty awesome!

@lorentzenchr
Member Author

lorentzenchr commented Aug 5, 2025

Maybe for another PR: we could avoid introducing the new parameter PA_C and instead repurpose an existing parameter, e.g., eta0.

@lorentzenchr lorentzenchr added the Waiting for Second Reviewer First reviewer is done, need a second one! label Aug 11, 2025
@lorentzenchr
Member Author

If any of the participating possible reviewers could give a second approval, that would be great. I would really like to get this over the finish line before merge conflicts arise (for example, from #31856).

When it's merged, I might open another PR to use eta0 instead of PA_C. Then we have time until the 1.8 release, but I only have time until the end of August.

@adrinjalali
Member

@adam2392 or @OmarManzoor could maybe have a look?


@OmarManzoor OmarManzoor left a comment


Thank you for the PR @lorentzenchr


@OmarManzoor OmarManzoor left a comment


LGTM. Thank you @lorentzenchr

@OmarManzoor OmarManzoor enabled auto-merge (squash) August 12, 2025 13:23
@lorentzenchr lorentzenchr disabled auto-merge August 12, 2025 13:28
@lorentzenchr
Copy link
Member Author

@OmarManzoor You spotted an error in the docstring of pa2. It should be corrected with b2144ee. Could you (auto-)merge again?

@OmarManzoor OmarManzoor enabled auto-merge (squash) August 12, 2025 13:38
@OmarManzoor OmarManzoor merged commit 3c74809 into scikit-learn:main Aug 12, 2025
36 checks passed
@lorentzenchr lorentzenchr deleted the dep_passive_aggressive branch August 12, 2025 15:06
Labels
module:linear_model, Waiting for Second Reviewer (First reviewer is done, need a second one!)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Do we want to keep the Passive Agressive estimators?
9 participants