-
-
Notifications
You must be signed in to change notification settings - Fork 26.2k
FIX Run common tests on SparseCoder #32077
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX Run common tests on SparseCoder #32077
Conversation
Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
@property | ||
def n_features_in_(self): | ||
"""Number of features seen during `fit`.""" | ||
return self.dictionary.shape[1] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for reviewers: With this modification, n_features_in_
is now only set if fit
is called. It makes this estimator follow our API and in line with this discussion #27724 (comment)
…te fit docstring, refactor check_array parameters
…kit-learn into SparseCoder_common_tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of suggestions for the docstrings. Looks good otherwise. Please also add a changelog entry. You can say that SparseCoder now follows the transformer API of scikit-learn and that now parameter and input validation is executed in fit (you can follow these instructions to write a changelog fragment https://github.com/scikit-learn/scikit-learn/blob/main/doc/whats_new/upcoming_changes/README.md)
Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
doc/whats_new/upcoming_changes/sklearn.decomposition/32077.enhancement.rst
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Thanks @FrancoisPgm
maybe @adrinjalali for a second review since you reviewed the previous attempt #27724. |
…ancement.rst Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
@@ -711,6 +716,38 @@ | |||
], | |||
}, | |||
SkewedChi2Sampler: {"check_dict_unchanged": dict(n_components=1)}, | |||
SparseCoder: { | |||
"check_estimators_dtypes": dict(dictionary=rng.normal(size=(5, 5))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a single case where we can set to have it pass all the tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dictionary
is not a very friendly parameter because it needs to have a shape compatible with X
, but all the checks have different X
s
sklearn/tests/test_common.py
Outdated
@@ -356,6 +360,7 @@ def test_set_output_transform(estimator): | |||
f"Skipping check_set_output_transform for {name}: Does not support" | |||
" set_output API" | |||
) | |||
estimator = next(_yield_instances_for_check(check_set_output_transform, estimator)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems to be a nice catch, but I think we should then go through all instances for this test, instead of only the first one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes indeed, for the sparsecoder right now it's always one instance but it can be more. I'll add a for loop to go through all instances.
@@ -0,0 +1,2 @@ | |||
- :class:`decomposition.SparseCoder` now follows the transformer API of scikit-learn. | |||
In addition, the :meth:`fit` method now validates the input and parameters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please reference yourself as author of the PR: "By :user:`your name <your handle>`".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Reference Issues/PRs
Towards #26482
See also #27724 and #26691 which appear to be stalled.
Closes #27724
Closes #26691
What does this implement/fix? Explain your changes.
This runs the common estimator tests on SparseCoder.
To match the behavior expected by the common tests,
n_components_
andn_features_in_
are changed from properties to attributes initialized in thefit
method.validate_data
is run infit
.Specific
dictionary
arguments for checks are added inPER_ESTIMATOR_CHECK_PARAMS
.To be able to use a specific
dictionary
incheck_set_output_transform
,_yield_instances_for_check
is used intest_set_output_transform
.