array API support for mean_absolute_percentage_error #29300
Conversation
CUDA: https://colab.research.google.com/drive/1SKzB8XaT2j_j4j7S-w4W6Du3DCo1I2B2?usp=sharing
Besides the following, LGTM.
LGTM once the merge conflicts are resolved.
Force-pushed from a65b388 to b5bc817
I left a comment regarding `mps`. I also noticed that some other tests seem to fail on `main` as well:
```
FAILED sklearn/metrics/cluster/tests/test_supervised.py::test_entropy_array_api[torch-mps-float32] - TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
FAILED sklearn/metrics/tests/test_common.py::test_array_api_compliance[accuracy_score-check_array_api_multilabel_classification_metric-torch-mps-float32] - ValueError: unrecognized csr_matrix constructor usage
FAILED sklearn/metrics/tests/test_common.py::test_array_api_compliance[zero_one_loss-check_array_api_multilabel_classification_metric-torch-mps-float32] - ValueError: unrecognized csr_matrix constructor usage
```
Is anyone having a look at these? (Or I can try to help.)
`sklearn/metrics/_regression.py` (outdated)
```python
epsilon = xp.asarray(xp.finfo(xp.float64).eps, dtype=xp.float64)
y_true_abs = xp.asarray(xp.abs(y_true), dtype=xp.float64)
mape = xp.asarray(xp.abs(y_pred - y_true), dtype=xp.float64) / xp.where(
    epsilon < y_true_abs, y_true_abs, epsilon
)
```
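For context, the epsilon-clamping idea in the snippet above can be sketched standalone with plain NumPy. This is an illustrative sketch, not scikit-learn's actual implementation; `mape` is a hypothetical helper name:

```python
import numpy as np

def mape(y_true, y_pred):
    # Clamp the denominator to machine epsilon so zero (or near-zero)
    # targets do not cause division by zero.
    eps = np.finfo(np.float64).eps
    y_true_abs = np.abs(y_true)
    denom = np.where(eps < y_true_abs, y_true_abs, eps)
    return np.mean(np.abs(y_pred - y_true) / denom)

# Each prediction is off by 10% of its target, so MAPE is ~0.1.
result = mape(np.array([1.0, 2.0, 4.0]), np.array([1.1, 1.8, 4.4]))
print(result)
```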
`float64` does not work with Torch on `mps`, unfortunately. One way to get the maximum supported float precision could be `xp.asarray(0.0).dtype`, as this function does. Can you think of a better way?
I might have missed something: why do we need to cast to `float64` each time we run `abs`?
Good catch. The original code did not upcast to `float64`, so we should not either.
> I might have missed something: why do we need to cast to `float64` each time we run `abs`?

btw the casting is for `xp.asarray`, not `abs`.
@EdAbati feel free to have a look and open a PR to fix those if you have time.
@EmilyXinyi I merged
`sklearn/metrics/_regression.py` (outdated)
```python
epsilon = xp.asarray(xp.finfo(xp.float64).eps, dtype=xp.asarray(0.0).dtype)
y_true_abs = xp.asarray(xp.abs(y_true), dtype=xp.asarray(0.0).dtype)
mape = xp.asarray(xp.abs(y_pred - y_true), dtype=xp.asarray(0.0).dtype) / xp.where(
```
I would rather use the floating point dtype used in the regressor's predictions than a device dependent dtype.
```diff
-epsilon = xp.asarray(xp.finfo(xp.float64).eps, dtype=xp.asarray(0.0).dtype)
-y_true_abs = xp.asarray(xp.abs(y_true), dtype=xp.asarray(0.0).dtype)
-mape = xp.asarray(xp.abs(y_pred - y_true), dtype=xp.asarray(0.0).dtype) / xp.where(
+epsilon = xp.asarray(xp.finfo(xp.float64).eps, dtype=y_pred.dtype)
+y_true_abs = xp.asarray(xp.abs(y_true), dtype=y_pred.dtype)
+mape = xp.asarray(xp.abs(y_pred - y_true), dtype=y_pred.dtype) / xp.where(
```
Using `y_pred.dtype` can't get past the test cases, and I think this is because `y_pred.dtype` could be `array_api_strict.int64`, but only floating-point dtypes are allowed in `__truediv__` (which means division, I believe). As well, MAPE behaves like a symmetric function instead of an asymmetric one if we use `y_pred.dtype`, which I suspect is due to the reduced accuracy in cases where the dtype is `int64`.

I am not sure if there is a better way to approach this... Any suggestions would be greatly appreciated, thank you!! :)
We have a function `_find_matching_floating_dtype` defined in the array API utils. Maybe you could use that to get the float dtype once and then use it in all three places?
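The idea behind such a helper can be sketched as follows. This is a hypothetical NumPy-only reimplementation for illustration, not the actual `_find_matching_floating_dtype` from `sklearn.utils._array_api`:

```python
import numpy as np

def find_matching_floating_dtype(*arrays):
    # Hypothetical sketch: pick one floating dtype suitable for all inputs.
    # If any input already has a floating dtype, promote among those;
    # otherwise (e.g. all-integer inputs, where __truediv__ would be
    # disallowed in array_api_strict) fall back to the namespace's
    # default floating dtype.
    float_dtypes = [
        a.dtype for a in arrays if np.issubdtype(a.dtype, np.floating)
    ]
    if float_dtypes:
        return np.result_type(*float_dtypes)
    return np.asarray(0.0).dtype

y_true = np.array([1, 2, 3])                        # int64
y_pred = np.array([1.0, 2.0, 3.0], dtype=np.float32)
dtype = find_matching_floating_dtype(y_true, y_pred)
print(dtype)
```

Computing the dtype once up front means the same cast can be reused for `epsilon`, `y_true_abs`, and the numerator, instead of repeating `xp.asarray(0.0).dtype` in three places.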
Thank you @OmarManzoor, I have made the changes accordingly.
Force-pushed from b9c8c7c to ddebb21
Otherwise LGTM. Thanks @EmilyXinyi
FYI I tested the CUDA CI label after #29456 was merged and it looks like this is working fine.
@lesteve The CUDA CI just failed. Isn't array-api-strict updated to the latest version?
I think you can ignore the GPU failures for now; I was mostly testing to make sure setting the label did trigger the GPU CI. The versions are all in the lock-file, and indeed array-api-strict is currently still 1.1.1 (see `build_tools/github/pylatest_conda_forge_cuda_array-api_linux-64_conda.lock`, line 249, at 409d187).
FYI, trying to update the lock-file to the latest version, there seems to be some issue, but I guess someone needs to look into it at one point: #29373 (or wait until Monday morning European time so that the lock-file PR is updated, and cross our fingers).
So should we update and use `maximum`, or should we keep the current code? In my opinion we can work against the latest version and fix the CI whenever possible. What do you think?
Yes, you should definitely ignore the GPU CI failures in this PR, and someone should look at the CI issues in #29373 at one point.
So I am not going to pretend I understand what is going on, and there is no huge rush on fixing this at all, but it seems like this PR broke the doc-min-dependencies build (i.e. we run the doc build with our minimal dependencies to check that all the examples run)... See the build log. Here is the error; it does look related to
Yes, I saw this too. How come this did not error out in the PR? It seems to be a case in this example:
The doc build in a PR only runs examples that have changes in the PR; if you want a full doc build you can push a commit with `[doc build]` in the commit message. To reproduce, I guess the best bet (if you are on Linux) is to use the doc-min-dependencies lock-file, see the quick doc I recently added.
I am on a Mac with M2 chip.
OK, you can still use the associated environment file (see the doc mentioned above) and see whether you can reproduce. A quick guess would be that we use an old numpy/pandas/something-else version and that the logic needs to be adapted...
I am also trying to reproduce the problem. I am on a Mac with Intel i5 cores.
I was not able to configure the environment on my Mac using the instructions that @lesteve mentioned, but maybe you can create a conda environment using the lock file or the environment file.
Reference Issues/PRs
Towards #26024
What does this implement/fix? Explain your changes.
Adds array API support for mean_absolute_percentage_error
Any other comments?
Keep this as draft until I add PR number and CUDA is green
Failing CI: I ran the command that triggers the failing test cases locally (`pytest --durations=20 --junitxml=test-data.xml --pyargs sklearn`) but they all pass. I am not sure what contributes to the difference in behaviour between our pipeline and my local tests...