MAINT make AdditiveChi2Sampler stateless and check that stateless Transformers don't raise NotFittedError #25190


Merged

Conversation

Vincent-Maladiere
Contributor

Reference Issues/PRs

Addresses the AdditiveChi2Sampler issue from the 12/2 drafting meeting and #12616.

What does this implement/fix? Explain your changes.

  • Make AdditiveChi2Sampler stateless (the sketch after this list illustrates the pattern) by:
    • removing the NotFittedError in the transform method
    • moving the computation logic of sample_interval from fit to transform
    • not storing sample_interval_ anymore
  • Edit the tests to reflect this change
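
A rough illustration of the pattern described in the first bullet (a simplified sketch, not the PR's actual diff; the toy class and its sample_interval handling are invented for illustration):

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.utils import check_array


class ToyStatelessSampler(TransformerMixin, BaseEstimator):
    """Toy stateless transformer: fit learns nothing, transform derives
    everything it needs from the constructor parameters."""

    def __init__(self, sample_interval=None):
        self.sample_interval = sample_interval

    def fit(self, X, y=None):
        # Only validate the input; no sample_interval_ attribute is stored.
        check_array(X)
        return self

    def transform(self, X):
        X = check_array(X)
        # The interval is computed here from the parameter, so transform
        # works even if fit was never called (no NotFittedError).
        interval = 0.5 if self.sample_interval is None else self.sample_interval
        return np.sqrt(interval) * X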

Any other comments?

Todo next: create a subsequent PR to add a common test to check that stateless estimators don't raise NotFittedError when transform is called without a prior call to fit.
cc @glemaitre


@glemaitre glemaitre left a comment


A couple of remarks.

@Vincent-Maladiere
Contributor Author

@glemaitre, do we need to edit the Changelog for this one?

@glemaitre
Member

Yes, we need to document the deprecation for the end user ;)

@Vincent-Maladiere
Contributor Author

And shall we add the common test for stateless estimators in another PR or in this one?

@glemaitre
Member

The common test is already prepared here: #25223

It is awaiting a second approval to be merged.

@Vincent-Maladiere
Contributor Author

Thanks for pointing this out. Interestingly, this PR doesn't leverage the stateless flag but uses get_feature_names_out as a proxy if I understand it correctly.

@glemaitre
Member

Oh right, my bad. I was confused about the end goal of that PR. We don't have a common test yet for the stateless part.

We can write the common test here and also add a whitelist for the estimators that still need to be fixed. Sorry about that.

@pytest.mark.parametrize(
    "estimator", STATELESS_ESTIMATORS, ids=_get_check_estimator_ids
)
def test_no_fitted_error_stateless_estimator(estimator):
Member


I would write this test differently. It should be a check in the estimator checks and should be added in _yield_all with an `if "stateless" in tags` condition (or something equivalent).
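
A loose sketch of that suggestion, assuming a "stateless" estimator tag and a per-transformer check generator (the helper names _safe_tags and _yield_transformer_checks mirror scikit-learn internals of the time, but the exact wiring inside the check generators may differ):

from sklearn.utils._tags import _safe_tags


def check_transformers_unfitted_stateless(name, transformer):
    """Placeholder for the stateless check discussed in this PR."""


def _yield_transformer_checks(transformer):
    # Gate the new check on the "stateless" tag instead of maintaining a
    # hand-written STATELESS_ESTIMATORS list in the test module.
    if _safe_tags(transformer, key="stateless"):
        yield check_transformers_unfitted_stateless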

Comment on lines 662 to 663
X, _ = make_blobs(n_samples=80, n_features=4, random_state=0)
X = X[X > 0].reshape(-1, 1)
Member


You can look at the other checks: we usually call a helper to transform X and y depending on the expectations of the estimator.

Comment on lines 665 to 667
if hasattr(estimator, "transform"):
    # Does not raise
    estimator.transform(X)
Member


It is true that we expect a transformer for now. So either we add it to the _yield for transformers, or we consider that estimators like DummyClassifier or DummyRegressor could become somehow stateless (with some parameter combinations), in which case we would want to try both transform and predict.
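
A minimal sketch of that second option (purely illustrative; as the follow-up below notes, the merged check stays transformer-only):

def _exercise_unfitted_stateless(estimator, X):
    # A stateless estimator should answer straight from its constructor
    # parameters, so neither call should raise NotFittedError.
    if hasattr(estimator, "transform"):
        estimator.transform(X)
    if hasattr(estimator, "predict"):
        estimator.predict(X)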

Member


I think we want to check the length of the transform output to verify that we still have n_samples.

Contributor Author


Thanks for these detailed explanations. Regarding your option number 2, wouldn't adding predict raise code coverage concerns since this line won't be tested?

Member


I am fine with keeping it to transformers for the moment. If we tackle "predictors" later, we will make the change at that point.

X = _enforce_estimator_tags_X(transformer, X)

transformer = clone(transformer)
# Should not raise
Member


I think that you can safely remove this comment :)

@ignore_warnings(category=FutureWarning)
def check_transformers_unfitted_stateless(name, transformer):
    """Check that using transform or predict without prior fitting
    doesn't raise a NotFittedError.
Member


Suggested change:
-    doesn't raise a NotFittedError.
+    doesn't raise a NotFittedError for stateless transformers.
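
Putting the review points together, a hedged reconstruction of what the finished check could look like (not the exact merged code; it reuses the private _enforce_estimator_tags_X helper shown in the diff above and adds the n_samples length assertion suggested earlier):

from sklearn.base import clone
from sklearn.datasets import make_blobs
from sklearn.utils._testing import ignore_warnings
from sklearn.utils.estimator_checks import _enforce_estimator_tags_X


@ignore_warnings(category=FutureWarning)
def check_transformers_unfitted_stateless(name, transformer):
    """Check that using transform without prior fitting
    doesn't raise a NotFittedError for stateless transformers.
    """
    X, _ = make_blobs(n_samples=80, n_features=4, random_state=0)
    X = X[X > 0].reshape(-1, 1)
    X = _enforce_estimator_tags_X(transformer, X)

    transformer = clone(transformer)
    X_trans = transformer.transform(X)

    # One transformed row per input sample.
    assert X_trans.shape[0] == X.shape[0]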


@glemaitre glemaitre left a comment


Otherwise LGTM.


@jjerphan jjerphan left a comment


Thanks for extending the test cases to check the compliant behavior of stateless estimators and for making AdditiveChi2Sampler stateless, @Vincent-Maladiere.

Here is a first review.

@Vincent-Maladiere Vincent-Maladiere changed the title from "MAINT make AdditiveChi2Sampler stateless" to "MAINT make AdditiveChi2Sampler stateless and check that stateless Transformers don't raise NotFittedError" on Jan 17, 2023

@jjerphan jjerphan left a comment


LGTM. Thank you, @Vincent-Maladiere.

@jjerphan jjerphan enabled auto-merge (squash) January 18, 2023 16:24
@jjerphan jjerphan merged commit e010e4f into scikit-learn:main Jan 18, 2023
jjerphan added a commit to jjerphan/scikit-learn that referenced this pull request Jan 20, 2023
…Transformers` don't raise `NotFittedError` (scikit-learn#25190)

Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>