Fix tests for numpy 2 and array api compat #29436

ogrisel · 2024-07-08T15:53:02Z

Fixes #29396.

Update the array API tests and the _average private helper for compatibility with NumPy 2.0.0.

github-actions · 2024-07-08T15:54:24Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 0ee1a21. Link to the linter CI: here

ogrisel · 2024-07-08T16:21:58Z

I triggered a CUDA CI run here:

https://github.com/scikit-learn/scikit-learn/actions/runs/9843260479/job/27174070905

EDIT: green!

The tests pass locally with numpy 2.0.0 and array-api-strict 2.0.1.

ogrisel · 2024-07-08T16:24:06Z

I marked this as "No changelog needed", but this is debatable: when raising exceptions on invalid inputs, our _average implementation now mimics NumPy 2's np.average rather than the NumPy 1 behavior.

lesteve · 2024-07-09T07:47:25Z

> The tests pass locally with numpy 2.0.0 and array-api-strict 2.0.1.

Yep, I confirm this as well.

> I marked this as "No changelog needed", but this is debatable: when raising exceptions on invalid inputs, our _average implementation now mimics NumPy 2's np.average rather than the NumPy 1 behavior.

I guess I would add a changelog because the exception type does change when weights is 1d, so this may affect a (very) small fraction of downstream users. On a similar note, I have been wondering when a changelog is needed and asked in our Discord to get people's feelings about this: https://discord.com/channels/731163543038197871/1260125820379463720/1260134931783225347.
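To see which exception a given NumPy version raises for mismatched weights, a small probe like the one below can help. It is only a sketch: `average_error_type` is a hypothetical helper (not part of scikit-learn or NumPy), it calls the public np.average rather than the private _average helper, and it deliberately makes no claim about which exception type comes back, since that is what may differ between NumPy versions.

```python
import numpy as np


def average_error_type(a, weights, axis=None):
    """Return the name of the exception np.average raises, or None on success.

    Hypothetical helper, for illustration only.
    """
    try:
        np.average(a, weights=weights, axis=axis)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None


a = np.ones((2, 3))

# Weights matching the shape of `a` along axis 1: no exception expected.
print(average_error_type(a, np.ones(3), axis=1))

# 1d weights whose length does not match `a` along axis 1.
print(average_error_type(a, np.ones(2), axis=1))

# 1d weights with no axis given, while the shapes differ.
print(average_error_type(a, np.ones(3), axis=None))
```

Running this under both NumPy 1.x and NumPy 2.x environments and comparing the printed names is one way to check whether downstream code that catches a specific exception type would be affected.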

ogrisel · 2024-07-09T09:06:28Z

> I guess I would add a changelog because the exception type does change when weights is 1d. So this may affect a (very) small fraction of downstream users.

I tried to come up with a changelog entry but found it too awkward to explain: this change is in a private helper that impacts most (but not all) scoring metrics, only when array API dispatch is enabled, and only for very weird input shapes. Furthermore, the change is similar to the change from NumPy 1 to NumPy 2, so I think very few people will be impacted (those upgrading existing code bases that use scikit-learn with array API on code that only works with NumPy 1 and has not been updated for NumPy 2 yet).

So I would rather not add a verbose and hard-to-understand changelog entry for this PR. But if you disagree I can still do it.

lesteve · 2024-07-09T09:21:53Z

> So I would rather not add a verbose and hard-to-understand changelog entry for this PR. But if you disagree I can still do it.

Seems reasonable not to add a changelog indeed, given the complexities involved.

betatim · 2024-07-09T11:58:30Z

I only saw the test failure after approving. It looks like finfo() is not supposed to accept None as dtype anymore. For now it still works and returns information about the default dtype. I think we could use np.float64 instead of None, as the type doesn't seem to depend on which namespace is being used.
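A minimal sketch of that suggestion, assuming a tolerance helper that derives its value from the machine epsilon. The name `atol_for_dtype` and the factor of 100 are illustrative assumptions, not the actual scikit-learn code:

```python
import numpy as np


def atol_for_dtype(dtype=None):
    """Return an absolute tolerance scaled from the dtype's machine epsilon.

    Hypothetical helper: None is mapped to np.float64 explicitly instead of
    being forwarded to np.finfo.
    """
    if dtype is None:
        dtype = np.float64  # assume NumPy's default floating point precision
    return float(np.finfo(dtype).eps) * 100


print(atol_for_dtype())            # same value as for np.float64
print(atol_for_dtype(np.float32))  # looser tolerance for single precision
```

Resolving the default before calling np.finfo keeps the helper independent of how a given NumPy version treats a None dtype.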

ogrisel · 2024-07-09T12:10:30Z

I will fix the warnings asap.

ogrisel · 2024-07-09T13:21:28Z

I triggered a new CUDA CI run here:

https://github.com/scikit-learn/scikit-learn/actions/runs/9857870880

EDIT: they are green.

ogrisel · 2024-07-09T13:29:44Z

@lesteve shall we enable auto-merge on this? This would make it saner to help review the many pending array API PRs.

lesteve · 2024-07-09T14:28:26Z

At a quick glance the diff looks fine, but there are two issues:

pylatest_conda_forge_mkl is red. It looks like an issue in the numpy<2 device handling: device is the string 'cpu' rather than a function:

        if np_version < parse_version("2.0.0") or np_version >= parse_version("2.1.0"):
            # NumPy 2.0 has a problem with the device attribute of scalar arrays:
            # https://github.com/numpy/numpy/issues/26850
>           assert device(array_xp) == device(result)
E           TypeError: 'str' object is not callable

array      = array([[ 0,  3,  0],
       [ 2, -1,  0],
       [ 0,  0,  0],
       [ 9,  8,  7],
       [ 4,  0,  5]])
array_namespace = 'torch'
array_xp   = tensor([[ 0,  3,  0],
        [ 2, -1,  0],
        [ 0,  0,  0],
        [ 9,  8,  7],
        [ 4,  0,  5]])
axis       = -2
csr_container = <class 'scipy.sparse._csr.csr_array'>
device     = 'cpu'

Not a blocker, but something I have noticed is that the pylatest_pip_openblas_pandas build takes a lot longer than previously: more than 50 minutes in this PR (see build log) vs 27 minutes in the last commit on main (see build log). Do you expect array API tests to have such an impact? This seems to have started happening in the first commit introducing array-api tests, 57970d6.
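The TypeError in the first issue above is the usual symptom of a function argument shadowing a function of the same name. A self-contained sketch of the failure and of the rename that avoids it follows; `FakeArray`, `check_shadowed`, and `check_renamed` are made-up names, and this `device` function is only a stand-in for the real helper:

```python
class FakeArray:
    """Tiny stand-in for an array object exposing a `device` attribute."""

    def __init__(self, device_name):
        self.device = device_name


def device(array):
    """Hypothetical module-level helper returning the device of an array."""
    return array.device


def check_shadowed(array, device):
    # The parameter named `device` hides the function above, so this line
    # tries to call a string.
    return device(array)


def check_renamed(array, device_name):
    # With a distinct parameter name, `device` still refers to the function.
    return device(array) == device_name


x = FakeArray("cpu")
try:
    check_shadowed(x, "cpu")
except TypeError as exc:
    print(exc)  # 'str' object is not callable
print(check_renamed(x, "cpu"))  # True
```

In a parametrized test, the same effect occurs when a parameter name matches an imported helper, so renaming the parameter is enough to restore access to the function.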

ogrisel · 2024-07-09T14:44:43Z

I will revert 021e07b to avoid the device problem. EDIT: actually this is a fixable problem, no need to revert.

I suspect that the test time is related to the Python version. Let me re-pin to 3.9 for now with a comment and see if the new CI run confirms this. If so, we should investigate separately whether we can reproduce it. Maybe it's related to coverage tracing performance in recent Python versions?

ogrisel · 2024-07-09T15:47:45Z

We are back to 27 min with Python 3.9...

I think this is good to merge as is but we should definitely investigate the cause of the slowness with Python 3.12.

lesteve · 2024-07-09T15:55:41Z

Merging, thanks!

The slow CI is weird, but it does remind me of something (not sure at all this was related, but I can try to find it again ...)

ogrisel · 2024-07-09T16:13:19Z

Thanks for the reviews.

lesteve · 2024-07-09T19:45:12Z

Found the slow CI with Python 3.12 I had in mind. At the time it was seen when moving the scipy-dev build to Python 3.12: #28383.

The work-around was to move dataset download to another build so this may not be related.

I guess it is easy to test your coverage idea by disabling coverage in the CI build ...
There seem to be some reports pointing in this direction, e.g. nedbat/coveragepy#1665. Setting COVERAGE_CORE=sysmon may be a way to speed things up on Python 3.12; it seems to do the trick: nedbat/coveragepy#1665 (comment)
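One way to try this from a build script is to set the variable only on interpreters where it applies. The sketch below is an assumption about how such a script could look, not what the scikit-learn CI does; `coverage_env` is a hypothetical helper, and whether the setting has any effect depends on the installed coverage.py version:

```python
import os
import sys


def coverage_env(base_env=None, version_info=sys.version_info):
    """Return a copy of the environment, adding COVERAGE_CORE=sysmon on 3.12+.

    Hypothetical helper for illustration only.
    """
    env = dict(os.environ if base_env is None else base_env)
    if tuple(version_info[:2]) >= (3, 12):
        # Ask coverage.py to use its sys.monitoring-based core.
        env["COVERAGE_CORE"] = "sysmon"
    return env


print(coverage_env({}, (3, 12, 0)))  # {'COVERAGE_CORE': 'sysmon'}
print(coverage_env({}, (3, 9, 0)))   # {}
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when launching the test command, which leaves the parent process environment unchanged.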

ogrisel added 2 commits July 8, 2024 17:40

Fix test_get_namespace_ndarray_with_dispatch for numpy 2

c861ad0

Skip problematic section of test_count_nonzero with numpy 2.0

ec89f79

github-actions bot added the module:utils label Jul 8, 2024

ogrisel mentioned this pull request Jul 8, 2024

Array API tests fail on main #29396

Closed

ogrisel added 3 commits July 8, 2024 17:59

Skip problematic section of test_average with numpy 2.0

15cc1d6

Fix adapt _average and test_average_raises_with_wrong_dtype for NumPy…

7923c20

… 2 compat

Merge branch 'main' into fix-test-for-numpy-2-and-array-api-compat

d9cf04c

ogrisel added the No Changelog Needed label Jul 8, 2024

ogrisel marked this pull request as ready for review July 8, 2024 16:22

ogrisel changed the title ~~Fix test for numpy 2 and array api compat~~ Fix tests for numpy 2 and array api compat Jul 8, 2024

Merge branch 'main' into fix-test-for-numpy-2-and-array-api-compat

6a15fde

MAINT install array-api-{strict,compat} without PyTorch on pylatest_p…

57970d6

…ip_openblas_pandas

ogrisel commented Jul 9, 2024

build_tools/update_environments_and_lock_files.py

This was referenced Jul 9, 2024

array API support for mean_poisson_deviance #29227

Merged

ENH Array API support for euclidean_distances and rbf_kernel #29433

Merged

betatim approved these changes Jul 9, 2024

TST remove numpy deprecation warnings in _atol_for_type by assuming N…

5074d55

…umPy's default floating point precision level

betatim reviewed Jul 9, 2024

sklearn/utils/tests/test_array_api.py

Avoid using getattr

021e07b

ogrisel added 2 commits July 9, 2024 16:52

Avoid shadowing the device function with a parameterized test argument

d142500

Pin back Python 3.9 for now

0ee1a21

lesteve approved these changes Jul 9, 2024

lesteve merged commit a922568 into scikit-learn:main Jul 9, 2024
30 checks passed

ogrisel deleted the fix-test-for-numpy-2-and-array-api-compat branch July 9, 2024 16:12

ogrisel restored the fix-test-for-numpy-2-and-array-api-compat branch July 9, 2024 16:12

ogrisel deleted the fix-test-for-numpy-2-and-array-api-compat branch July 9, 2024 16:12

lesteve mentioned this pull request Jul 10, 2024

CI Update pylatest-pip-openblas-pandas build to Python 3.11 #29444

Merged

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Sep 9, 2024

Fix tests for numpy 2 and array api compat (scikit-learn#29436)

aa1e4cf

glemaitre pushed a commit that referenced this pull request Sep 11, 2024

Fix tests for numpy 2 and array api compat (#29436)

2534289
