[WIP] Test determinism of estimators #7270
Conversation
Created a test that fits two copies of each estimator and compares their predictions on unseen data.
sklearn/tests/test_common.py (outdated)

    X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.5)

    for name, Estimator in all_estimators(type_filter=['classifier',
                                                       'regressor']):
transformers and unsupervised still to do
sklearn/tests/test_common.py (outdated)

    est1.fit(X_train, y_train)
    est2.fit(X_train, y_train)

    assert_array_almost_equal(est1.predict(X_test),
Shouldn't this be assert_array_equal instead of assert_array_almost_equal?
This works for classifiers and regressors; for the latter we need almost equal.
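To illustrate why the looser comparison matters for regressors: predictions that differ only by floating-point noise (the arrays below are made up for illustration) fail an exact check but pass the almost-equal one.

    import numpy as np
    from numpy.testing import assert_array_almost_equal

    pred1 = np.array([0.1, 0.2, 0.3])
    pred2 = pred1 + 1e-9  # e.g. a different summation order inside BLAS

    print(np.array_equal(pred1, pred2))      # False: exact comparison fails
    assert_array_almost_equal(pred1, pred2)  # passes: default tolerance is 6 decimals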
But can we call a method deterministic if it does not always give the exact same result?
On a machine with finite floating point precision? I would.
Note that for CV estimators (

Interesting, it seems except for
Do we have good reason to believe there are multiple estimators that should fail such a test in scikit-learn?

In terms of the specifics, equivalence of

You might also consider whether some datasets are more likely to elicit variation than others. For example, I'd consider using a dataset with a random target so that a meaningful classifier is unlikely learnable.
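A sketch of the random-target idea (my construction, not code from the PR): with noise features and a random binary target there is no learnable signal, so any disagreement between two fitted copies comes from the estimator itself.

    import numpy as np
    from sklearn.model_selection import train_test_split

    rng = np.random.RandomState(0)
    X = rng.rand(100, 5)             # uninformative features
    y = rng.randint(0, 2, size=100)  # random binary target: nothing to learn
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, train_size=0.5, random_state=0)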
    y_train = np.reshape(y_train, (-1, 1))
    y_test = np.reshape(y_test, (-1, 1))

    needs_state = 'random_state' in signature(Estimator.__init__).parameters
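Pieced together, the fragments under review amount to roughly the test below. This is a reconstruction for readability rather than the exact PR code; the skip set, the dataset, and the fixed seed value are placeholders.

    from inspect import signature

    from numpy.testing import assert_array_almost_equal
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.utils import all_estimators  # lived in sklearn.utils.testing when this PR was written

    # Placeholder for the handful of estimators the PR skips because they fail.
    SKIPPED = set()

    def check_estimators_are_deterministic():
        X, y = make_classification(n_samples=100, random_state=0)
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, train_size=0.5, random_state=0)

        for name, Estimator in all_estimators(type_filter=['classifier',
                                                           'regressor']):
            if name in SKIPPED:
                continue
            # Pin the seed only for estimators that accept one.
            params = {}
            if 'random_state' in signature(Estimator.__init__).parameters:
                params['random_state'] = 0
            est1 = Estimator(**params).fit(X_train, y_train)
            est2 = Estimator(**params).fit(X_train, y_train)
            assert_array_almost_equal(est1.predict(X_test),
                                      est2.predict(X_test))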
use get_params()
You end up with something like needs_state = 'random_state' in Estimator().get_params(). Is that really nicer than inspecting the __init__ arguments?
There's a helper set_random_state in the estimator_checks.
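For comparison, the two detection idioms discussed here look like this side by side (LogisticRegression is just an arbitrary example estimator):

    from inspect import signature
    from sklearn.linear_model import LogisticRegression

    # Inspecting the constructor signature, as in the patch under review.
    needs_state = 'random_state' in signature(LogisticRegression.__init__).parameters

    # Asking an instance for its parameters, as suggested in the review.
    needs_state = 'random_state' in LogisticRegression().get_params()

    # The set_random_state helper mentioned above does roughly the latter check
    # and then calls est.set_params(random_state=...) when the parameter exists.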
I should add: without something like
Given @TomDLT's comment about the *CV estimators not passing on

I like the idea of the random dataset.
The other thing that comes to mind here is that we may have false negatives.
Starting with the different solvers that may exist in one estimator.
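To make the false-negative concern concrete, here is a sketch (mine, not from the PR) of running the same two-fit comparison over each solver of a single estimator rather than only its default; the solver names are ones LogisticRegression accepts.

    from numpy.testing import assert_array_almost_equal
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=100, random_state=0)

    # Repeat the determinism check for each solver, not just the default one.
    for solver in ['lbfgs', 'liblinear', 'newton-cg', 'sag', 'saga']:
        clf1 = LogisticRegression(solver=solver, random_state=0, max_iter=500).fit(X, y)
        clf2 = LogisticRegression(solver=solver, random_state=0, max_iter=500).fit(X, y)
        assert_array_almost_equal(clf1.predict_proba(X), clf2.predict_proba(X))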
If we would "just" describe the legal space of all parameters in a more
IMHO having this test is useful, even if it only covers the default argument case (perfection being the enemy of good or some such).
It's pretty weird that Regression uses stratified CV...? And certainly

LinearRegressionCV uses stratified? That's a bug.
I think several comments here are red herrings.
Our default

Do we want to continue on with this? Marking as Stalled.
@jnothman: I am interested in carrying on if this is relevant.
Reference Issue
Fixes #7139
What does this implement/fix? Explain your changes.
Fit two instances of the same estimator on a toy problem. Use both to predict on an unseen subset of data and compare predictions. If an estimator has a random_state argument, provide it; if not, then not.

Any other comments?
There are a few estimators that I am skipping at the moment because they fail the test. Most are blatantly not deterministic, but two or so fail with a different error.
Unsure about the location, should this be a check in utils/estimator_checks.py instead?

- check_deterministic in utils/estimator_checks.py?
- Why do HuberRegressor, LogisticRegressionCV, LinearRegression, RANSACRegressor fail?
- Why does RadiusNeighborsClassifier fail? Failure mode is different to the above.