-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
Use Array API in r2_score
#27102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Array API in r2_score
#27102
Conversation
r2_score
denominator = ( | ||
weight * (y_true - np.average(y_true, axis=0, weights=sample_weight)) ** 2 | ||
).sum(axis=0, dtype=np.float64) | ||
numerator = xp.sum(weight * (y_true - y_pred) ** 2, axis=0, dtype=xp.float64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the end goal is to compute a scalar and we want float64
precision but some namespace / device combinations might not support this (e.g. pytorch / MPS only support float32 operations), I think we need to move to CPU before computing the float64 sum.
See the following for a similar case:
Note that this will make the computation slightly less efficient for devices that do support float64 arithmetic, but maybe this is a pragmatic fix.
Alternatively, we could attempt to write a utility in _array_api.py
that dynamically inspects if a given namespace / device combo can support float64 operation (and cache the result) so as to trigger the to device="cpu"
copy only in cases when it's actually needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented this, but I ran into issues with CuPy. Apparently, device="cpu" does not work, see the below error message. I made a hacky workaround for this that keeps data on the GPU only for CuPy, but I'm not happy with it. Any ideas?
The hacky workaround in question:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably need a new entry in |
@elindgren I merged #27137 and synced your branch with |
Thanks! Sorry for the radio silence, picking this back up. |
Co-authored-by: Tim Head <betatim@gmail.com>
Co-authored-by: Tim Head <betatim@gmail.com>
Some Array API compatible libraries do not have a device called 'cpu'. Instead we try and detect the lib+device combination that does not support float64.
I've opened a follow-up PR at #27904 (of course it keeps your commits and includes your name in the changelog)
Yet you've done well and your work has been very useful. I can also see now how awful of a maze it is when beginning to tackle dtype and device management. Can't say for sure the solutions I'm proposing in #27904 are more likeable. |
…nto ENH/r2_score_array_api
… rather than from_dlpack
…nto ENH/r2_score_array_api
…nto ENH/r2_score_array_api
…nto ENH/r2_score_array_api
…se fast code paths do not generalized to nd inputs
Closing this PR in favor of #27904 which includes the original commits of this PR and credits the work of @elindgren in the changelog. I think we are converging on a consensus. |
Reference Issues/PRs
One of the items outlined in #26024.
What does this implement/fix? Explain your changes.
Migrates
r2_score
to use the Array API as outlined in #26024.This PR also introduces the function
_average
that mimics the functionality ofnp.average
for weighted averages, as that is not in the Array API spec._average
can be found underutils/_array_api.py
.Any other comments?
None