-
-
Notifications
You must be signed in to change notification settings - Fork 26k
TST Common test preserve dtype #18054
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TST Common test preserve dtype #18054
Conversation
sklearn/utils/estimator_checks.py
Outdated
# it's not possible to preserve dtypes in transform with clustering | ||
# same for MissingIndicator | ||
if ( | ||
not isinstance(transformer, (ClusterMixin, MissingIndicator)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be handled with a preserves_dtype
tag that is empty?
afb17fa
to
43af513
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks. Minor comments otherwise LGTM.
To solve the remaining failure, we should first merge: #18045 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks LGTM, pending #18045 being merged to fix the last failing test.
@rth my bad, I did a typo in the tag name and so the test was run. It should pass now |
Thanks! +1 for me. @thomasjpfan do you have any other comments? |
sklearn/utils/estimator_checks.py
Outdated
X_trans, | ||
X_trans_float_64, | ||
atol=1e-10, rtol=5e-4, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have PRs where we had to adjust these tolerances for different environments. Are we okay with running into this issue in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically, this part of the test has 2 issues:
- precision issue as you mentioned;
- dtype issue (if
np.float32
is the only supported dtype).
However, we caught a bug using this. So it is kind of useful.
To solve the first issue, I think that we could design a decorator that could adjust the tolerance depending on the platform. For the second case, I think that we should avoid this testing if np.float64
is not the default dtype.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we run this with git tag [arm64]
to see if this passes there?
After this gets merged, it would be interesting to see this run on conda-forge to see if it passes there as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick pass, mostly looks good
Any other comments @NicolasHug @thomasjpfan ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
3 approvals, I am going to merge :) |
great !
|
Co-authored-by: Sebastian Kießling <sebkies@gmail.com> Co-authored-by: Roman Yurchak <rth.yurchak@gmail.com>
closes #16290
helps toward #11000
Take over #16290 and introduce a common test to check if estimators preserve dtypes