-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
ENH Add Array API compatibility to mean_absolute_error
#27736
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. Could you please expand the tests to make sure that we cover the raised exceptions?
More details below.
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! One more suggestion for a more informative message. The test needs to be updated accordingly.
BTW, I think that numpy error message would benefit from a similar treatment upstream.
I will make an issue on Also BTW I added a set of tests for |
I launched the Array API tests with pytorch with an MPS device on my local laptop and with pytorch on a machine with a cuda device and all the mean absolute error Array API compliance tests pass. However I observed some failures with cupy:
Here are the details of the tracebacks: =============================================================== FAILURES ===============================================================
______________ test_array_api_compliance[mean_absolute_error-check_array_api_regression_metric-cupy.array_api-None-None] _______________
metric = <function mean_absolute_error at 0x7ff1b90f0b80>, array_namespace = 'cupy.array_api', device = None, dtype = None
check_func = <function check_array_api_regression_metric at 0x7ff1714e4280>
@pytest.mark.parametrize(
"array_namespace, device, dtype", yield_namespace_device_dtype_combinations()
)
@pytest.mark.parametrize("metric, check_func", yield_metric_checker_combinations())
def test_array_api_compliance(metric, array_namespace, device, dtype, check_func):
> check_func(metric, array_namespace, device, dtype)
sklearn/metrics/tests/test_common.py:1873:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
sklearn/metrics/tests/test_common.py:1810: in check_array_api_regression_metric
check_array_api_metric(
sklearn/metrics/tests/test_common.py:1747: in check_array_api_metric
metric_xp = metric(y_true_xp, y_pred_xp, sample_weight=sample_weight)
sklearn/utils/_param_validation.py:214: in wrapper
return func(*args, **kwargs)
sklearn/metrics/_regression.py:214: in mean_absolute_error
output_errors = _average(xp.abs(y_pred - y_true), weights=sample_weight, axis=0)
sklearn/utils/_array_api.py:651: in _average
weights = xp.swapaxes(weights, -1, axis)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <sklearn.utils._array_api._ArrayAPIWrapper object at 0x7ff1b5d40730>, name = 'swapaxes'
def __getattr__(self, name):
> return getattr(self._namespace, name)
E AttributeError: module 'cupy.array_api' has no attribute 'swapaxes'
sklearn/utils/_array_api.py:203: AttributeError
_____________ test_array_api_compliance[mean_absolute_error-check_array_api_multioutput_regression_metric-cupy-None-None] ______________
metric = <function mean_absolute_error at 0x7ff1b90f0b80>, array_namespace = 'cupy', device = None, dtype = None
check_func = <function check_array_api_multioutput_regression_metric at 0x7ff1714e4310>
@pytest.mark.parametrize(
"array_namespace, device, dtype", yield_namespace_device_dtype_combinations()
)
@pytest.mark.parametrize("metric, check_func", yield_metric_checker_combinations())
def test_array_api_compliance(metric, array_namespace, device, dtype, check_func):
> check_func(metric, array_namespace, device, dtype)
sklearn/metrics/tests/test_common.py:1873:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
sklearn/metrics/tests/test_common.py:1831: in check_array_api_multioutput_regression_metric
check_array_api_metric(
sklearn/metrics/tests/test_common.py:1754: in check_array_api_metric
assert_allclose(
sklearn/utils/_testing.py:284: in assert_allclose
actual, desired = np.asanyarray(actual), np.asanyarray(desired)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
E TypeError: Implicit conversion to a NumPy array is not allowed. Please use `.get()` to construct a NumPy array explicitly.
cupy/_core/core.pyx:1475: TypeError
________ test_array_api_compliance[mean_absolute_error-check_array_api_multioutput_regression_metric-cupy.array_api-None-None] _________
metric = <function mean_absolute_error at 0x7ff1b90f0b80>, array_namespace = 'cupy.array_api', device = None, dtype = None
check_func = <function check_array_api_multioutput_regression_metric at 0x7ff1714e4310>
@pytest.mark.parametrize(
"array_namespace, device, dtype", yield_namespace_device_dtype_combinations()
)
@pytest.mark.parametrize("metric, check_func", yield_metric_checker_combinations())
def test_array_api_compliance(metric, array_namespace, device, dtype, check_func):
> check_func(metric, array_namespace, device, dtype)
sklearn/metrics/tests/test_common.py:1873:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
sklearn/metrics/tests/test_common.py:1831: in check_array_api_multioutput_regression_metric
check_array_api_metric(
sklearn/metrics/tests/test_common.py:1752: in check_array_api_metric
metric_xp = xp.asarray(metric_xp, device="cpu")
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
obj = Array([0.5, 1. ], dtype=float32)
def asarray(
obj: Union[
Array,
bool,
int,
float,
NestedSequence[bool | int | float],
SupportsBufferProtocol,
],
/,
*,
dtype: Optional[Dtype] = None,
device: Optional[Device] = None,
copy: Optional[bool] = None,
) -> Array:
"""
Array API compatible wrapper for :py:func:`np.asarray <numpy.asarray>`.
See its docstring for more information.
"""
# _array_object imports in this file are inside the functions to avoid
# circular imports
from ._array_object import Array
_check_valid_dtype(dtype)
if device is not None and not isinstance(device, _Device):
> raise ValueError(f"Unsupported device {device!r}")
E ValueError: Unsupported device 'cpu'
/data/parietal/store3/work/ogrisel/mambaforge/envs/dev/lib/python3.10/site-packages/cupy/array_api/_creation_functions.py:59: ValueError
======================================================= short test summary info ======================================================== |
For the swapaxes problem, we should report the problem upstream (if not already existing) and mark it the corresponding test case as xfail in the scikit-learn test suite with a link to the upstream issue. For information, I am running the 12.2.0 version of cupy which is the latest available on conda-forge apparently. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following might fix the other 2 failures with cupy.
You can probably use google colab or kaggle code if you need a machine with a cuda device to launch the tests with cupy.
Thank you very much for checking I've added the suggestions, but I still have to test if |
Looks like the tests in the red pipeline actually passed, but the publishing step failed, not related to the PR. |
I tested with FYI it seems there is an issue with
Testing with
I see the same error when running the tests on the new |
Since we don't have a cupy CI yet, we can just assume an array-api-compat version with the fix will be out before it bothers our CI :) But if that's the case, maybe we can spend some effort to skip cupy tests when array_api_compat's version is one of the known versions with the bug. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM otherwise. Thanks for the PR.
Hey @glemaitre sorry for the ping, I noticed you are the other reviewer :) I was wondering if there is anything I should change in this PR |
Maybe @OmarManzoor would be interested in reviewing this one as well :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @EdAbati. Could you kindly resolve the conflict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few comments regarding the change log, otherwise looks good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @EdAbati!
Thank you both 🙂 |
Reference Issues/PRs
Towards #26024
What does this implement/fix? Explain your changes.
It makes the
mean_absolute_error
implementation compatible and tested with the Array API.Not sure if it's the best approach, but I have converted the
np.average
implementation so that it is compatible with the Array API. Is there a better way? (I will fix add the tests to make codecov happy, if you agree to have the_average
function)cc @betatim @ogrisel