TST refactor instance generation and parameter setting #29702
I agree that instance generation is a bit messy. I'm okay with defining a new module to hold them all.
)
from sklearn.svm import SVC, SVR, LinearSVC, LinearSVR, NuSVC, NuSVR, OneClassSVM
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor
from sklearn.utils import all_estimators
I'm surprised that importing `all_estimators` does not cause a circular dependency.
A few other PRs somewhat depend on this one, so it would be nice to move it forward.
This is already a net improvement. LGTM
# than `LinearRegression` if we don't fix `min_samples` parameter.
# For common test, we can enforce using `LinearRegression` that
# is the default estimator in `RANSACRegressor` instead of `Ridge`.
if issubclass(Estimator, RANSACRegressor):
Not in this PR, because it would go sideways, but I think we should have a small follow-up PR that moves as many of the parameters set below as possible into `TEST_PARAMS` and makes sure that we call `_set_checking_parameters`. This would be more consistent with the rest and would reduce the number of places where we set parameters.
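To make the suggestion above concrete, here is a minimal sketch of a single per-class parameter registry applied by one helper. It is an illustration only, not the code in this PR: the `FakeEstimator` classes are hypothetical stand-ins used to keep the sketch dependency-free, the parameter values are made up, and `set_checking_parameters` is an assumed simplification of the real helper.

```python
class FakeEstimator:
    """Hypothetical stand-in exposing get_params/set_params like an estimator."""

    def __init__(self, **params):
        self._params = dict(params)

    def get_params(self):
        return dict(self._params)

    def set_params(self, **params):
        self._params.update(params)
        return self


class FakeRANSAC(FakeEstimator):
    pass


class FakeKMeans(FakeEstimator):
    pass


# Hypothetical registry: one place that says which parameters each
# estimator class should use in the common tests (values are illustrative).
TEST_PARAMS = {
    FakeRANSAC: {"max_trials": 10},
    FakeKMeans: {"max_iter": 5, "n_init": 2},
}


def set_checking_parameters(estimator):
    """Apply the registered test parameters for the estimator's class, if any."""
    # Walk the MRO so that subclasses inherit the entry of a registered parent.
    for klass in type(estimator).__mro__:
        if klass in TEST_PARAMS:
            estimator.set_params(**TEST_PARAMS[klass])
            break
    return estimator


est = set_checking_parameters(FakeKMeans(max_iter=300, n_clusters=8))
print(est.get_params())
```

With this shape, special cases such as the `RANSACRegressor` branch would become one more registry entry rather than an extra `if issubclass(...)` in the helper.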
LGTM. I would only like to add a couple of docstrings for future me.
Also, I would like to have a follow-up PR to refactor `_construct_instance` so that it leverages `TEST_PARAMS` instead of doing its own internal business.
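One possible shape for such a refactor, sketched under assumptions: the constructor looks up required arguments in the registry instead of hard-coding them. The `Leaf` and `Meta` classes, the use of factories as registry values, and the `construct_instance` name are all hypothetical and only illustrate the idea.

```python
import inspect


class Leaf:
    """Hypothetical estimator whose constructor only has defaults."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha


class Meta:
    """Hypothetical meta-estimator with a required `estimator` argument."""

    def __init__(self, estimator, n_jobs=None):
        self.estimator = estimator
        self.n_jobs = n_jobs


# Hypothetical registry. Values are factories so that every call builds
# fresh sub-estimators instead of sharing one mutable instance.
TEST_PARAMS = {
    Meta: lambda: {"estimator": Leaf(alpha=0.1)},
}


def construct_instance(Estimator):
    """Build an instance using only the registry for non-default arguments."""
    params = TEST_PARAMS.get(Estimator, dict)()
    # Constructor arguments without a default must come from the registry.
    required = [
        name
        for name, p in inspect.signature(Estimator).parameters.items()
        if p.default is inspect.Parameter.empty
        and p.kind in (p.POSITIONAL_OR_KEYWORD, p.KEYWORD_ONLY)
    ]
    missing = [name for name in required if name not in params]
    if missing:
        raise ValueError(
            f"{Estimator.__name__} needs a TEST_PARAMS entry for {missing}"
        )
    return Estimator(**params)


meta = construct_instance(Meta)
print(type(meta.estimator).__name__, meta.estimator.alpha)
```

Failing loudly on a missing entry keeps the registry as the only place where test parameters are defined.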
Co-authored-by: Guillaume Lemaitre <guillaume@probabl.ai>
Enabling auto-merge. Let's try to get a new machine with another processor for the Debian build to pass.
This PR refactors instance generation out of other files, and also fixes #16311 by being explicit about which parameters to set for each estimator.
While working on the tests, I'm trying to keep each PR rather small so that it is easy to review.
I'm not 100% happy with the `utils/_test_common/instance_generator.py` path though.
cc @glemaitre @adam2392 @thomasjpfan