Skip to content

TST refactor instance generation and parameter setting #29702

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Sep 4, 2024

Conversation

adrinjalali
Copy link
Member

This PR refactors instance generation out of other files, and also fixes #16311 by being explicit about which parameters to set for each estimator.

As a part of working on tests, I'm trying to keep each PR rather small for them to be easy to review.

I'm not 100% happy with the utils/_test_common/instance_generator.py path though.

cc @glemaitre @adam2392 @thomasjpfan

@adrinjalali adrinjalali added the Developer API Third party developer API related label Aug 22, 2024
Copy link

github-actions bot commented Aug 22, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: ecb01b6. Link to the linter CI: here

thomasjpfan
thomasjpfan previously approved these changes Aug 22, 2024
Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that instance generation is a bit messy. I'm okay with defining a new module to hold them all.

)
from sklearn.svm import SVC, SVR, LinearSVC, LinearSVR, NuSVC, NuSVR, OneClassSVM
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor
from sklearn.utils import all_estimators
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised that importing all_estimators does not cause a circular dependency.

@thomasjpfan thomasjpfan dismissed their stale review August 22, 2024 20:42

I have a comment to be addressed

@adrinjalali
Copy link
Member Author

A few other PRs somewhat depend on this one, would be nice to move it forward.

@adam2392 adam2392 self-requested a review August 26, 2024 14:13
Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already a net improvement. LGTM

@glemaitre glemaitre self-requested a review September 3, 2024 09:01
# than `LinearRegression` if we don't fix `min_samples` parameter.
# For common test, we can enforce using `LinearRegression` that
# is the default estimator in `RANSACRegressor` instead of `Ridge`.
if issubclass(Estimator, RANSACRegressor):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in this PR because it'll go sideways but I think that we should have a small subsequent PR where we should specify as much as possible the parameter below into the TEST_PARAM and make sure that we call the _set_checking_parameters. This would be more consistent with the rest and reduce the number of places that we set parameters.

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'll like only to add a couple of docstrings for the future me.

Also, I would like to have a subsequent PR to refactor _construct_instance such that it leverage TEST_PARAMS instead of doing its internal own business.

adrinjalali and others added 5 commits September 3, 2024 13:30
Co-authored-by: Guillaume Lemaitre <guillaume@probabl.ai>
Co-authored-by: Guillaume Lemaitre <guillaume@probabl.ai>
Co-authored-by: Guillaume Lemaitre <guillaume@probabl.ai>
Co-authored-by: Guillaume Lemaitre <guillaume@probabl.ai>
@glemaitre glemaitre enabled auto-merge (squash) September 3, 2024 14:38
@glemaitre
Copy link
Member

Enabling auto-merge. Let's try to get a new machine with another processor for the Debian build to pass.

@glemaitre glemaitre merged commit 3cb7b58 into scikit-learn:main Sep 4, 2024
28 checks passed
@adrinjalali adrinjalali deleted the test/param_set branch September 4, 2024 11:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Developer API Third party developer API related No Changelog Needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

check_estimator assumes too much about 3rd-party estimators
3 participants